INTRODUCTION


This is the final volume of Prof. Smirnov’s five-volume course of
higher mathematics, about whose history some remarks were made 
in the Introduction to Vol. I of the present English edition. 

The first Russian edition of this volume, published in 1947, enjoyed 
the distinction of being the first book in any language on the theory 
of integration and the elements of functional analysis to be written 
specifically with the needs of theoretical physicists in mind. Indeed 
nearly twenty years after its publication its only rivals would appear 
to be works by other Russian authors. 

Functional analysis arose as the result of generalizing various con- 
cepts and methods of classical branches of mathematics. Although it 
has become (in the manner characteristic of contemporary mathe- 
matics) a very abstract discipline, its general results can be used to 
derive the solution of particular problems in classical analysis and in 
applied mathematics. Its successes have been such that it is difficult 
to imagine that a strong light cannot be cast on the solution of almost 
any problem in mathematical analysis by the use of the concepts and 
techniques of functional analysis. Large areas of the modern theories 
of approximation, differential equations and mathematical physics 
are dominated by these methods and so research workers in physics 
and engineering need to become familiar with the ideas of functional 
analysis. They will find a clear and authoritative introduction to 
these topics in this volume, but it should not be regarded as of use 
to them only; students of pure mathematics will find here an account 
not only of the essentials of a flourishing branch of modern pure 
mathematics but also of its links with the past and of the motivation 
of much of the recent abstract work in the subject. 
PREFACE


In modern theoretical treatments of mathematical physics great
importance attaches to the theory of functions of a real variable, 
the various functional spaces and the general theory of operators. 
These subjects provide the essential material for the present book, 
which is based on the fifth volume of my Course of Higher Mathe- 
matics, published in 1947. 

The branches of the theory of functions of a real variable in the 
present book include the theory of the classical Stieltjes integral, the 
Lebesgue-Stieltjes integral and the theory of completely additive 
set functions. 

The first chapter discusses the theory of the classical Stieltjes 
integral, and also considers the more general definition of the Stieltjes 
integral over an interval of any type, based on the equality of the 
upper and lower Darboux integrals with a subdivision of the basic 
interval into intervals of any type. The Fourier-Stieltjes and
Cauchy-Stieltjes integrals are taken as examples of the classical Stieltjes
integral, and inversion formulae are established for these. The Stieltjes 
integral is also defined for the plane case. 

The space C of continuous functions is also discussed in Chapter I, 
and the general form of linear functionals in this space is established. 

The second chapter deals with the foundations of the metric theory 
of functions of a real variable and the Lebesgue-Stieltjes integral. 
The whole of the theory is expounded for the case of a plane and the 
possibility of its obvious generalization to the case of n-dimensional 
Euclidean space is indicated. The theory of measure is built up on 
the basis of any non-negative, additive, normal function, defined on 
semi-open two-dimensional intervals. The Lebesgue-Stieltjes integral 
of a bounded function is defined on the basis of the coincidence of 
the upper and lower Darboux integrals when the basic measurable 
set is subdivided into measurable sets. Chapter II ends with a detailed 
discussion of an averaging process for functions and the properties 
of the mean functions, when the averaging kernel is subject to certain 
conditions. Wide use is subsequently made of the averaging process. 

The third chapter deals with the theory of completely additive
set functions. After proving the initial theorems, the theorem on 
the decomposition of a completely additive set function into a singular 
and an absolutely continuous part is stated without proof, and the 
fundamental facts relating to this decomposition are discussed. The 
case of a single independent variable is treated in detail. Also, an 
absolutely continuous set function is studied in the general case, 
and the formula is established for changing the variables in a
multi-dimensional Lebesgue-Stieltjes integral.

The third chapter ends with a proof of the above-mentioned theorem 
on decomposing a completely additive set function into two terms. 
Furthermore, the concept of Hellinger integral is introduced in the 
multi-dimensional case, and its properties are investigated. In particu- 
lar, the connection is established between the Hellinger integral and 
the Lebesgue-Stieltjes integral. The case of the one-dimensional 
Hellinger integral is analyzed in detail. All the proofs at the end of 
Chapter III are based on a preliminary detailed treatment of the 
properties of completely additive set functions [78, 79]. 

The fourth chapter contains an exposition of the foundations of 
the general theory of metric and normed spaces. It ends with a 
detailed discussion of generalized derivatives, embedding theorems 
for the various function spaces, and the theory of functionals in 
the space of continuously differentiable functions. All these questions 
are related to S. L. Sobolev’s well-known investigations. They are 
dealt with in his monograph Some Applications Of Functional Analysis 
To Mathematical Physics (Nekotorye primeneniya funktsional’nogo 
analiza v matematicheskoi fizike) (1950).

Generalized derivatives are defined in two ways — with the aid of 
the formula for integration by parts and by means of the closure 
of functions with continuous derivatives; the equivalence of these 
definitions is proved. Special attention is paid to the case of a star- 
shaped domain. Furthermore, the complete normed functional spaces 
$W_p^{(l)}(D)$ and $W_p^{l}(D)$ are introduced; the first of these consists of the
functions $\varphi(x)$ that are defined in the domain $D$ and have all generalized
derivatives of order $l$, where $\varphi(x)$ and the derivatives in question
belong to $L_p(D)$, whilst the second space consists of the functions
$\varphi(x)$ that have all generalized derivatives up to and including order $l$.
It is subsequently proved that, for a wide class of domains $D$, $W_p^{(l)}(D)$
and $W_p^{l}(D)$ consist of the same set of functions, and that the norms
introduced into them are equivalent. Moreover, fairly simple proofs
are given for the space $W_p^{l}(D)$ of theorems that are particular cases of
the embedding theorems for $W_p^{(l)}(D)$.

These theorems are first formulated, then a complete proof 
of them is given in fine print, on the basis of Sobolev’s integral 
form. All this material is closely related to the above-mentioned 
monograph. 

The final fifth chapter deals with the general theory of Hilbert 
space, the whole of the treatment being first given for the case of 
bounded operators. Fredholm’s theorems are proved for linear 
equations with completely continuous operators. They have been 
stated without proof for normed spaces. 

The relevant integral forms in terms of the differential solutions 
are given with the aid of Hellinger integrals for self-conjugate operators 
on a continuous spectrum. Examples are given of the application of
the general theory of bounded operators in $l_2$ and $L_2$.

The final section of the fifth chapter is devoted to the theory of 
unbounded operators in Hilbert space. After proving the general 
theorems, numerous examples are given of differential operators with 
one and several independent variables. The general theory of extension 
of closed symmetric operators is followed by a discussion of the 
special case of semi-bounded operators, and in particular, of their 
Friedrichs extensions. 

The publication of a sixth volume is envisaged, dealing with certain 
problems of the modern theory of differential operators with one and 
several independent variables. 

In addition to specialized articles, I have made use of numerous
books in preparing the present volume. The chief titles are as follows: 
V.I. Glivenko, The Stieltjes Integral (Integral Stilt’esa); I. P. Natanson, 
Theorie der Funktionen einer reellen Veränderlichen; Saks, Theory of the
Integral (Teoriya integrala); de la Vallée-Poussin, Intégrales de Lebesgue.
Fonctions d’ensembles. Classes de Baire; Stone, Linear Transformations
in Hilbert Space and their Applications to Analysis; N. I. Akhiezer
and I. M. Glazman, Theory of Linear Operators (Teoriya lineinykh
operatorov); A. I. Plesner, Spectral Theory of Linear Operators, I
(Spektral’naya teoriya lineinykh operatorov, I) (Uspekhi matematicheskikh
nauk, t. IX, 1941); N. I. Akhiezer, Infinite Jacobian
Matrices and the Problem of Moments (Beskonechnye matritsy
Jakobi i problema momentov) (loc. cit.); S. L. Sobolev, Some Applications
of Functional Analysis to Mathematical Physics (Nekotorye
primeneniya funktsional’nogo analiza v matematicheskoi fizike).
CHAPTER I


THE STIELTJES INTEGRAL 


1. Sets and their powers. The various concepts of integral play a
large part in the application of mathematical analysis to present-day 
science, and we shall discuss in our first two chapters the theory of 
integration in a more general form than previously. As a preliminary, 
the present section contains a certain amount of elementary set 
theory, which is supplementary to that given in [IV; 16]. 

Suppose we have two sets $A_1$ and $A_2$, consisting of objects of any
type (elements). The sets are said to have the same power if a one-to-one
correspondence can be established between the elements of $A_1$
and the elements of $A_2$, i.e. a correspondence in which a definite
element of $A_2$ is associated with each element of $A_1$, and conversely,
each element of $A_2$ is associated with one and only one element of $A_1$.
An infinite set (i.e. a set containing an infinite number of elements)
is described as denumerable if it has the same power as the set of
all positive integers, i.e. if its elements can be enumerated by means
of positive integers: $a_1, a_2, a_3, \ldots$ Two denumerable sets have the
same power. Let us examine some properties of denumerable sets.
We consider the part of a denumerable set containing an infinite set
of elements $a_{p_1}, a_{p_2}, \ldots$, where $p_1, p_2, \ldots$ is an increasing sequence
of positive integers. The elements of this new set are also numbered.
The number of each element is the subscript of $p$. In other words, they
are numbered in order of increasing subscripts $p_1, p_2, \ldots$ An infinite
part of a denumerable set is therefore a denumerable set. We now
take two denumerable sets: $A(a_1, a_2, a_3, \ldots)$, consisting of elements
$a_1, a_2, a_3, \ldots$, and $B(b_1, b_2, b_3, \ldots)$, consisting of elements $b_1, b_2, b_3, \ldots$;
we form their sum, i.e. we combine the elements of both sets into a
single set $C$. The new set $C$ thus obtained is generally called the sum
of sets $A$ and $B$. This new set is also denumerable. For we only need
to arrange the elements of set $C$ say in the following order: $a_1, b_1,
a_2, b_2, \ldots$, in order to see that $C$ is denumerable. If there are identical
elements $a_p$, $b_q$, we have to take one of them and strike out the
remainder. A similar argument applies for the sum of a finite number of
denumerable sets, i.e. the sum of a finite number of denumerable sets
is a denumerable set.

Suppose we have a denumerable set of denumerable sets. The 
elements of all these sets can be denoted by a letter with two integral
indices $a_q^{(p)}$. The upper index indicates the number of the set to which
the element belongs, and the lower the number which the element
has in the denumerable set to which it belongs. There is no difficulty
in enumerating all the elements $a_q^{(p)}$. We take as the first element
the one in which both indices are unity: $a_1^{(1)}$. We then take the elements
in which the sum of the indices is 3, and arrange them in order of
increasing upper index. We thus obtain $a_2^{(1)}$, $a_1^{(2)}$ as the second and
third elements of the sum of sets. We now take the elements in which
the sum of the indices is 4, and arrange them in order of increasing
upper index: $a_3^{(1)}$, $a_2^{(2)}$, $a_1^{(3)}$. This gives the fourth, fifth and sixth
elements of the sum of sets. It may be seen on continuing this construction
that the sum of a denumerable number of denumerable
sets is a denumerable set. This assertion would obviously still hold
if certain of the component sets were finite instead of denumerable.
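
The enumeration just described can be sketched in a few lines of Python. This is only an illustration and not part of the original exposition: the names `enumerate_pairs` and `position` are our own, and a pair `(upper, lower)` stands for the element with upper index `upper` and lower index `lower`.

```python
from itertools import count, islice


def enumerate_pairs():
    """Yield index pairs (upper, lower) in the order used in the text.

    Pairs are grouped by the sum upper + lower = 2, 3, 4, ... and, inside
    each group, listed in order of increasing upper index.
    """
    for total in count(2):
        for upper in range(1, total):
            yield upper, total - upper


def position(upper, lower):
    """Return the 1-based number assigned to the pair (upper, lower)."""
    total = upper + lower
    # The groups with index sums 2, ..., total - 1 together contain
    # 1 + 2 + ... + (total - 2) pairs.
    earlier = (total - 2) * (total - 1) // 2
    return earlier + upper


first_six = list(islice(enumerate_pairs(), 6))
print(first_six)
```

The closed formula in `position` shows that every pair of indices receives exactly one positive integer, which is the content of the statement that the sum of sets is denumerable.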

Let A be an infinite set. We choose any element of it and assign 
it the number one. The remainder of the set will be infinite, as before. 
We choose any element from it and assign it the number 2. On proceed- 
ing in this way, it will be seen that a denumerable set can be ex- 
tracted from any infinite set. The set remaining after such extraction 
may be either empty, i.e. contain no element at all, or may be finite, 
or infinite. Let us show that, if this remaining set is infinite, it has 
the same power as the original set, i.e. the following assertion holds: 
if, after extracting a denumerable set P from an infinite set A, an 
infinite set B remains, sets A and B have the same power. We extract 
from the infinite set B a further denumerable set Q, and let C be 
the remaining set. The original set A is now split into three sets:
$A = P + Q + C$, of which the set C may be empty, finite or infinite,
whilst sets P and Q are denumerable sets. We had $A = P + B$
prior to the second extraction. A one-to-one correspondence is readily
established between the elements of A and B; for we have
$A = P + Q + C$ and $B = Q + C$. The sum $P + Q$ of denumerable sets
is a denumerable set, so that a one-to-one correspondence can be
established between the elements of $P + Q$ and Q. We put every
element of the set C in correspondence with itself. A one-to-one
correspondence will thus be established between the elements of A
and B. A direct consequence of the assertion just proved is that,
if a denumerable set is added to an infinite set, the new set obtained 
will have the same power as the original set. Both the assertions 
regarding the subtraction and addition of a denumerable set remain 
in force if the denumerable set is replaced by a finite set. The proof 
is precisely the same as above. 
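
The correspondence built in this proof can be made concrete. In the following sketch the particular sets are our own illustrative choice and do not come from the text: A is the set of positive integers, P the odd numbers, Q the numbers 2, 6, 10, ... and C the multiples of 4, so that B = Q + C is the set of even numbers. The function name `to_remainder` is likewise hypothetical.

```python
def to_remainder(x):
    """Map a positive integer x of A one-to-one onto an even number of B.

    Elements of P + Q are listed as p1, q1, p2, q2, ... and the m-th
    element of that list is sent to the m-th element of Q; elements of C
    are left in place, as in the proof.
    """
    if x % 4 == 0:
        return x              # elements of C correspond to themselves
    if x % 2 == 1:
        n = (x + 1) // 2      # x is the n-th element of P
        m = 2 * n - 1         # its place in the list p1, q1, p2, q2, ...
    else:
        n = (x + 2) // 4      # x is the n-th element of Q
        m = 2 * n             # its place in the same list
    return 4 * m - 2          # the m-th element of Q


images = [to_remainder(x) for x in range(1, 101)]
```

On this initial segment the images are distinct even numbers and include every even number up to 100, in agreement with the assertion that A and B have the same power.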

We mentioned earlier [IV; 15] that either the set of rational numbers 
belonging to an interval [a, b], or the set of all rational numbers,
is denumerable. This is proved in essentially the same way as the 
statement that the sum of a denumerable number of denumerable 
sets is denumerable. The role of upper index is played by the numerator 
of the fraction, and the role of lower index by the denominator; 
it is necessary to start by considering positive fractions. Let us 
now adduce an example of a non-denumerable set. We take all the 
real numbers belonging to the interval [0,1]. We can write each of 
them, apart from zero, as an infinite decimal fraction with integral 
part equal to zero, and conversely, every such decimal fraction will 
correspond to a real number of our interval. We do not make use 
of finite fractions, since a finite fraction yields the same number 
as an infinite fraction having a 9 recurring, e.g. 0.37 = 0.36999.... 
Let us show that the set of these real numbers is non-denumerable. 
We use reductio ad absurdum. Suppose that all our decimal fractions, 
including the fraction 0.00..., giving the left-hand end of the interval,
can be enumerated. A new decimal fraction, with an integral part 
equal to zero, may be formed as follows. As the first figure after 
the decimal point we take a number different from the first figure 
of the first of the enumerated decimal fractions, as the second figure 
we take some number different from the second figure of the second 
of the enumerated decimal fractions, and so on. An infinite decimal 
fraction is obtained (we make no use of the figure 0 in forming the 
figures in the new decimal fraction), which differs from all the enu- 
merated fractions. Hence the real number corresponding to it is not 
enumerated, which contradicts the fact that all the real numbers of 
the interval [0,1] are enumerated. We have thus shown that the 
set of all the real numbers belonging to the interval [0,1] is non- 
denumerable. This set is said to have the power of a continuum. 
It may easily be seen that the set of the real numbers belonging to 
any finite interval [a, b] has the same power as the set of real numbers 
belonging to the interval [0, 1]. A one-to-one correspondence between 
the elements of these sets is established by means of the formula 
$y = (x - a)/(b - a)$. When $x$ runs through the interval $[a, b]$, the
variable $y$ runs through the interval $[0, 1]$. If we use the formula
$y = \tan(\pi x - \pi/2)$, when $x$ varies inside the interval $[0, 1]$, $y$ runs
through the set of all real numbers, i.e. the set of all real numbers 
also has the power of a continuum. If the ends of the interval are 
not included in the set, this does not change its power, inasmuch as 
the subtraction or addition of a finite set from or to an infinite set 
does not change the power of the infinite set. 
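
The diagonal construction used in the proof of non-denumerability can be tried out on a finite table of digits. The sketch below is only an illustration on a finite prefix of a supposed enumeration (the actual argument concerns an infinite list); the name `diagonal_digits` and the rule of using only the figures 1 and 2 are our own assumptions, chosen so that the figure 0 never occurs.

```python
def diagonal_digits(rows):
    """Return digits that differ from the n-th digit of the n-th row.

    Each row is a string holding the figures after the decimal point of
    one enumerated fraction. Only the figures 1 and 2 are produced.
    """
    digits = []
    for n, row in enumerate(rows):
        digits.append("2" if row[n] == "1" else "1")
    return "".join(digits)


rows = ["3699", "5000", "1415", "7777"]
new_fraction = diagonal_digits(rows)
print("0." + new_fraction)
```

By construction the new fraction differs from the n-th listed fraction in its n-th figure, so it cannot coincide with any row of the table.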

We shall in future write $[a, b]$ for a closed interval and $(a, b)$ for
an open interval, i.e. an interval from which the ends are excluded.
If the left-hand end is excluded and the right-hand included, we
use the symbol $(a, b]$, and similarly for $[a, b)$. The numbers $a$ and $b$
may take infinite values: $a = -\infty$ and $b = +\infty$, i.e. the intervals
discussed may be infinite on the left or right. For example, the closed
interval $[-\infty, +\infty]$ contains both the infinitely remote elements.
Correspondingly, the function $f(x)$ may be defined for $x = -\infty$ and
$x = +\infty$, and we can write e.g. $f(-\infty)$. Continuity at $x = -\infty$ is
equivalent to the condition $\lim_{x \to -\infty} f(x) = f(-\infty)$. Similarly for $x = +\infty$.
Furthermore, the usual notations may be used: $\lim_{x \to -\infty} f(x) = f(-\infty + 0)$
and $\lim_{x \to +\infty} f(x) = f(+\infty - 0)$.

It is easily shown [I; 43] that a function $f(x)$, finite and continuous
in the closed interval $[-\infty, +\infty]$, is uniformly continuous in this
interval.


2. The Stieltjes integral and its basic properties. Let us recall the 
definition of Riemann integral, of which use has generally been 
made in the previous volumes. Let $[a, b]$ be a finite interval and $f(x)$
a bounded function, given in this interval. We subdivide the interval:
$a = x_0 < x_1 < \ldots < x_{n-1} < x_n = b$, choose a point $\xi_k$ in each
sub-interval $[x_{k-1}, x_k]$ and form the sum of products:

$$\sigma = \sum_{k=1}^{n} f(\xi_k)(x_k - x_{k-1}). \tag{1}$$

If this sum has a finite limit $A$ for any choice of points $\xi_k$ as the
subdivision becomes indefinitely finer, this limit is in fact called the
integral of $f(x)$ over the interval $[a, b]$. Let $\delta$ be the greatest of the
differences $x_k - x_{k-1}$. An indefinitely fine subdivision of $[a, b]$ is
equivalent to the fact that $\delta \to 0$, and the existence of the finite
limit $A$ for the sum (1) is equivalent to the following: given any
positive $\varepsilon$, there exists a positive $\eta$ such that

$$\left| A - \sum_{k=1}^{n} f(\xi_k)(x_k - x_{k-1}) \right| < \varepsilon \quad \text{for } \delta < \eta.$$


A more general integral can be constructed in essentially the same
way. It was first introduced by the Dutch mathematician Stieltjes
in 1894, in his studies on continued fractions, then was widely developed
and applied both in pure and applied mathematics. Let $f(x)$
and $g(x)$ be two functions given in the finite interval $[a, b]$, at every
point of which they take finite values. Instead of the sum (1), we
form the sum

$$\sigma = \sum_{k=1}^{n} f(\xi_k)[g(x_k) - g(x_{k-1})]. \tag{2}$$

We shall call this a Riemann-Stieltjes sum. If it tends to a definite
finite limit for any choice of points $\xi_k$ when the subdivision becomes
indefinitely finer, $f(x)$ is said to be integrable with respect to the
function $g(x)$ in the interval $[a, b]$, and we write

$$\int_a^b f(x)\, dg(x) = \lim \sum_{k=1}^{n} f(\xi_k)[g(x_k) - g(x_{k-1})].$$
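
A Riemann-Stieltjes sum of the form (2) is easy to evaluate numerically. The following is a minimal sketch under our own assumptions: equally spaced points of subdivision, the midpoints of the sub-intervals as the points $\xi_k$, and the hypothetical helper names `riemann_stieltjes_sum`, `uniform_points` and `midpoints`.

```python
def riemann_stieltjes_sum(f, g, points, tags):
    """Sum of f(tag_k) * [g(x_k) - g(x_{k-1})] over the sub-intervals."""
    return sum(
        f(t) * (g(x1) - g(x0))
        for x0, x1, t in zip(points, points[1:], tags)
    )


def uniform_points(a, b, n):
    """Points of subdivision a = x_0 < x_1 < ... < x_n = b, equally spaced."""
    return [a + (b - a) * k / n for k in range(n + 1)]


def midpoints(points):
    """Midpoint of every sub-interval, used here as the intermediate point."""
    return [(x0 + x1) / 2 for x0, x1 in zip(points, points[1:])]


points = uniform_points(0.0, 1.0, 1000)
tags = midpoints(points)

# With g(x) = x the sum is the ordinary Riemann sum (1) for f(x) = x.
riemann_value = riemann_stieltjes_sum(lambda x: x, lambda x: x, points, tags)

# f(x) = x integrated with respect to g(x) = x**2.
stieltjes_value = riemann_stieltjes_sum(lambda x: x, lambda x: x * x, points, tags)
```

For this fine subdivision the first value is close to 1/2 and the second close to 2/3, the latter being the value of the Riemann integral of $2x^2$ over $[0, 1]$.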


In the Riemann integral, the role of $g(x)$ is played by $x$. The new
integral evidently has many properties similar to the Riemann integral,
and the proofs of these properties are precisely the same as for the
Riemann integral. We give these properties on the assumption that
all the integrals in the formulae below exist:

$$
\begin{aligned}
\int_a^b \sum_{k=1}^{p} a_k f_k(x)\, dg(x) &= \sum_{k=1}^{p} a_k \int_a^b f_k(x)\, dg(x); \\
\int_a^b f(x)\, d\sum_{k=1}^{p} a_k g_k(x) &= \sum_{k=1}^{p} a_k \int_a^b f(x)\, dg_k(x) \quad (a_k \text{ are constants}); \\
\int_a^b f(x)\, dg(x) &= \int_a^c f(x)\, dg(x) + \int_c^b f(x)\, dg(x).
\end{aligned}
\tag{3}
$$

We have further the obvious equation:

$$\int_a^b dg(x) = g(b) - g(a). \tag{4}$$


In the first and second of formulae (3), the existence of the integral 
on the left follows from the existence of the integrals on the right. 

Let us consider the proof of the formula for integration by parts.
Let the integral of $g(x)$ with respect to $f(x)$ exist; we show that the
integral of $f(x)$ with respect to $g(x)$ now exists. We transform the
sum (2) by collecting the terms containing the values of $g(x)$ at
coincident points:

$$\sigma = -\sum_{k=1}^{n-1} g(x_k)[f(\xi_{k+1}) - f(\xi_k)] + g(b) f(\xi_n) - g(a) f(\xi_1).$$

On adding and subtracting the difference

$$[f(x) g(x)]_a^b = f(b) g(b) - f(a) g(a),$$

we can write

$$\sigma = [f(x) g(x)]_a^b - \Big\{ \sum_{k=1}^{n-1} g(x_k)[f(\xi_{k+1}) - f(\xi_k)] + g(a)[f(\xi_1) - f(a)] + g(b)[f(b) - f(\xi_n)] \Big\}. \tag{5}$$

The braces contain the Riemann-Stieltjes sum (2) for the integral of
$g(x)$ with respect to $f(x)$, formed with the points of subdivision
$a, \xi_1, \ldots, \xi_n, b$ and the intermediate points $a, x_1, \ldots, x_{n-1}, b$.
By hypothesis, the integral of $g(x)$ with
respect to $f(x)$ exists, i.e. the expression in the braces tends to this
integral on indefinite subdivision of the interval. Hence, by (5), the
sum $\sigma$ has a limit, i.e. the integral of $f(x)$ with respect to $g(x)$ exists,
and we can write the formula for integration by parts:

$$\int_a^b f(x)\, dg(x) = [f(x) g(x)]_a^b - \int_a^b g(x)\, df(x) \tag{6}$$

or

$$\int_a^b f(x)\, dg(x) + \int_a^b g(x)\, df(x) = [f(x) g(x)]_a^b, \tag{7}$$

where the existence of one of the integrals written implies the existence
of the other.
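
The rearrangement behind the formula for integration by parts holds exactly for the finite sums, before any passage to the limit, and this can be checked numerically. In the sketch below the functions $f(x) = \cos x$ and $g(x) = x^2$, the subdivision and the choice of intermediate points are our own assumptions; `riemann_stieltjes_sum` is a hypothetical helper.

```python
import math


def riemann_stieltjes_sum(f, g, points, tags):
    """Sum of f(tag_k) * [g(x_k) - g(x_{k-1})] over the sub-intervals."""
    return sum(
        f(t) * (g(x1) - g(x0))
        for x0, x1, t in zip(points, points[1:], tags)
    )


def g(x):
    return x * x


f = math.cos
a, b, n = 0.0, 1.0, 50
points = [a + (b - a) * k / n for k in range(n + 1)]
# Intermediate point of [x_{k-1}, x_k]: 30% of the way along the sub-interval.
tags = [x0 + 0.3 * (x1 - x0) for x0, x1 in zip(points, points[1:])]

sigma = riemann_stieltjes_sum(f, g, points, tags)

# Sum of g with respect to f: the old intermediate points become points of
# subdivision, and a, x_1, ..., x_{n-1}, b become the intermediate points.
dual_points = [a] + tags + [b]
braces = riemann_stieltjes_sum(g, f, dual_points, points)

boundary = f(b) * g(b) - f(a) * g(a)
print(sigma + braces, boundary)
```

The two printed numbers agree up to rounding error, which is the discrete counterpart of the statement that the two integrals add up to the boundary term.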

Two particular cases of the Stieltjes integral must be mentioned. 
Suppose that the interval $[a, b]$ is subdivided into a finite number
of parts: $a = c_0 < c_1 < \ldots < c_{p-1} < c_p = b$, and that $g(x)$ has a
constant value $g_k$ inside each of the sub-intervals $(c_{k-1}, c_k)$. Thus
$g(x)$ has a jump $s_k = g_{k+1} - g_k$ at every point $c_k$ lying inside the
interval $[a, b]$. Jumps are also possible at the ends of the interval:
the jump $s_0 = g_1 - g(a)$ at the left-hand end and $s_p = g(b) - g_p$
at the right-hand end. Suppose further that $f(x)$ is continuous at
all the points of subdivision $c_k$, and at the ends of the interval. Let
$x_k$ be points which are not points of subdivision $c_q$, excepting possibly
$c_0$ and $c_p$. In the sum (2), all the terms in which $x_{k-1}$ and $x_k$ lie inside
the same interval $(c_{q-1}, c_q)$ will vanish, since in this case
$g(x_{k-1}) = g(x_k)$. If the interval $[x_{k-1}, x_k]$ contains a point of discontinuity
$c_q$, $f(\xi_k)$ will tend to $f(c_q)$ on indefinite subdivision, and $g(x_k) - g(x_{k-1})$
to $s_q$, and it is immediately evident that (2) gives in the limit the
following finite sum:

$$\lim \sum_{k=1}^{n} f(\xi_k)[g(x_k) - g(x_{k-1})] = \sum_{q=0}^{p} f(c_q)\, s_q. \tag{8}$$


If $c_q$ is a point of subdivision of $[a, b]$, we have to consider both
the intervals having $c_q$ as an end, and the result is found to be the
same. We now take a second particular case. Let $f(x)$ and $g(x)$ be
continuous in $[a, b]$ and let $g(x)$ have a derivative $g'(x)$ inside $[a, b]$,
which is Riemann integrable and therefore bounded. On applying
Lagrange’s formula to the difference $g(x_k) - g(x_{k-1})$, we can write the
sum (2) as

$$\sum_{k=1}^{n} f(\xi_k)[g(x_k) - g(x_{k-1})] = \sum_{k=1}^{n} f(\xi_k)\, g'(\xi_k')(x_k - x_{k-1}), \tag{9}$$

where $\xi_k'$ is an interior point of $[x_{k-1}, x_k]$. We can put
$f(\xi_k) = f(\xi_k') + \varepsilon_k$, where, by virtue of the uniform continuity of $f(x)$ in $[a, b]$,
the greatest of the $|\varepsilon_k|$ tends to zero on indefinite subdivision, i.e.
given any positive $\varepsilon$, there exists a positive $\eta$ such that $|\varepsilon_k| < \varepsilon$ if
$\delta < \eta$. We can rewrite the sum (9) as

$$\sum_{k=1}^{n} f(\xi_k)[g(x_k) - g(x_{k-1})] = \sum_{k=1}^{n} f(\xi_k')\, g'(\xi_k')(x_k - x_{k-1}) + \sum_{k=1}^{n} \varepsilon_k\, g'(\xi_k')(x_k - x_{k-1}). \tag{$9_1$}$$


The product of two Riemann integrable functions is also integrable
[I; 117], and the first term on the right of ($9_1$) tends, on indefinitely
finer subdivision, to the Riemann integral of $f(x) g'(x)$. It may easily
be shown that the second term tends to zero. In fact, the function
$g'(x)$ is bounded, as mentioned above, i.e. $|g'(x)| < M$, where $M$
is a definite positive number. As we have said, given a positive $\varepsilon$,
there exists a positive $\eta$ such that $|\varepsilon_k| < \varepsilon$ for $\delta < \eta$, and we have
the inequality:

$$\left| \sum_{k=1}^{n} \varepsilon_k\, g'(\xi_k')(x_k - x_{k-1}) \right| < \sum_{k=1}^{n} \varepsilon M (x_k - x_{k-1}) = \varepsilon M (b - a),$$

from which it follows, since $\varepsilon$ is arbitrary, that the second term on
the right-hand side of ($9_1$) tends to zero. We therefore have in the
limit:

$$\int_a^b f(x)\, dg(x) = \int_a^b f(x)\, g'(x)\, dx, \tag{10}$$

i.e. given our assumptions, the Stieltjes integral reduces to an ordinary
Riemann integral. In the previous case, it degenerated to a finite
sum. It may easily be shown that (10) still holds if we require that
$f(x)$ be Riemann integrable instead of continuous. We shall consider
later the question of the existence of the Stieltjes integral as defined
above, and of certain more general integrals, to be defined in due
course. An essential fact in all this will be that the function $g(x)$ is
assumed non-decreasing in $[a, b]$.
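
The first particular case, in which the Stieltjes integral degenerates to the finite sum (8), can be illustrated numerically. The step function, the function $f(x) = x^2 + 1$ and the subdivision below are our own illustrative choices, not taken from the text; the number of sub-intervals is odd so that the interior jump point $x = 1$ is never a point of subdivision.

```python
def g(x):
    """Step function on [0, 2] with jumps at c_0 = 0, c_1 = 1 and c_2 = 2."""
    if x <= 0:
        return 0.0   # g(a)
    if x < 1:
        return 1.0   # constant value g_1 inside (0, 1)
    if x == 1:
        return 2.0   # value at the jump itself; never sampled below
    if x < 2:
        return 4.0   # constant value g_2 inside (1, 2)
    return 4.5       # g(b)


def f(x):
    return x * x + 1.0


# Jumps s_0, s_1, s_2 at the points c_0, c_1, c_2.
jumps = {0.0: 1.0 - 0.0, 1.0: 4.0 - 1.0, 2.0: 4.5 - 4.0}
finite_sum = sum(f(c) * s for c, s in jumps.items())

n = 2001
points = [2.0 * k / n for k in range(n + 1)]
tags = [(x0 + x1) / 2 for x0, x1 in zip(points, points[1:])]
sigma = sum(
    f(t) * (g(x1) - g(x0))
    for x0, x1, t in zip(points, points[1:], tags)
)
print(sigma, finite_sum)
```

Only the three sub-intervals that contain a jump contribute to `sigma`, and its value approaches `finite_sum` as the subdivision is refined. Formula (10) can be checked in the same manner with a differentiable $g(x)$.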

In future, we shall often describe a non-decreasing function as increas- 
ing. The maximum of such a function is g(b), and its minimum g(a). 
The following section is of a preparatory nature. It is of fundamental 
importance, not only for investigating the existence of the Stieltjes 
integral as defined above, but also for studying the problem of the 
existence of the more general integrals that we are to introduce 
later. 


3. Darboux sums. When discussing the Riemann integral, we 
brought in the so-called Darboux sum. Analogous sums will play a 
basic role in all the generalized integrals to be introduced below. 
We shall construct these sums in the present section and investigate 
their properties for the case of the Stieltjes integral. All the concepts 
introduced in this section, and all the facts proved, will be repeated 
with certain minor modifications in regard to future generalized 
types of integral, and we shall often refer back to the present 
results. 

Let us first of all recall the definition of the strict bounds of a
set of real numbers [I; 39]. Let $E$ be a set of real numbers, and let it
be bounded from above, i.e. there exists a number $L$ such that all
the numbers of the set are less than $L$. There now exists a definite
number $M$ with the following property: every number of the set $E$
is not greater than $M$, but, given any positive $\varepsilon$, there are numbers
of $E$ which are greater than $M - \varepsilon$. This number $M$ is called the
strict upper bound of the set $E$. Similarly, if the set is bounded from
below, i.e. if all the numbers of the set are greater than some definite
number, the set has a strict lower bound $m$, which has the following
property: every number of the set $E$ is not less than $m$, but, given
any positive $\varepsilon$, there are numbers of $E$ which are less than $m + \varepsilon$.
If the set is unbounded from above, its strict upper bound is said
to be $(+\infty)$, and similarly, if it is unbounded from below, its strict
lower bound is said to be $(-\infty)$. The following notation is used for
the strict bounds:

$$m = \inf E \quad \text{and} \quad M = \sup E.$$

Let $f(x)$ and $g(x)$ be functions bounded in the interval $[a, b]$, which
may be finite or infinite, $g(x)$ being a non-decreasing function, and let

$$a = x_0 < x_1 < x_2 < \ldots < x_{n-1} < x_n = b$$

be a subdivision of $[a, b]$ which we write symbolically as $\delta$. In the
case of an interval infinite on the left, $a = -\infty$, and for an interval
infinite on the right, $b = +\infty$. Further let $m_k$ and $M_k$ be the strict
lower and strict upper bounds of $f(x)$ in the sub-interval $[x_{k-1}, x_k]$.
We form the following Stieltjes-Darboux sums, corresponding to the
subdivision $\delta$ of $[a, b]$:

$$s_\delta = \sum_{k=1}^{n} m_k [g(x_k) - g(x_{k-1})]; \qquad S_\delta = \sum_{k=1}^{n} M_k [g(x_k) - g(x_{k-1})]. \tag{11}$$

For a bounded function $f(x)$, we have $|f(x)| \le L$, where $L$ is some
positive number. On taking into account that $g(x_k) - g(x_{k-1}) \ge 0$,
given any law of subdivision $\delta$ we have the following inequality for
sums (11):

$$|s_\delta| \le \sum_{k=1}^{n} L [g(x_k) - g(x_{k-1})] = L[g(b) - g(a)], \qquad |S_\delta| \le L[g(b) - g(a)].$$

Along with sums (11), we form the following Riemann-Stieltjes
sum:

$$\sigma_\delta = \sum_{k=1}^{n} f(\xi_k)[g(x_k) - g(x_{k-1})], \tag{12}$$

where $\xi_k$ is a point of the interval $[x_{k-1}, x_k]$. On observing that
$m_k \le f(\xi_k) \le M_k$ and $g(x_k) - g(x_{k-1}) \ge 0$, we have, for any subdivision $\delta$:

$$s_\delta \le \sigma_\delta \le S_\delta. \tag{13}$$
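
The sums (11) and (12) and the inequality (13) can be sketched numerically. To keep the strict bounds exact, the sketch assumes (this restriction is ours, not the text's) that $f(x)$ is itself non-decreasing, so that its strict lower and upper bounds on a sub-interval are its values at the left and right ends. The example functions and the helper name `stieltjes_darboux_sums` are likewise illustrative.

```python
def stieltjes_darboux_sums(f, g, points):
    """Return the lower and upper Stieltjes-Darboux sums for a subdivision.

    Assumption: f is non-decreasing, so on [x_{k-1}, x_k] its strict lower
    bound is f(x_{k-1}) and its strict upper bound is f(x_k).
    """
    lower = 0.0
    upper = 0.0
    for x0, x1 in zip(points, points[1:]):
        dg = g(x1) - g(x0)
        lower += f(x0) * dg
        upper += f(x1) * dg
    return lower, upper


def f(x):
    return x * x


def g(x):
    return x ** 3


n = 100
points = [k / n for k in range(n + 1)]
lower, upper = stieltjes_darboux_sums(f, g, points)

# Riemann-Stieltjes sum with the midpoints as intermediate points.
sigma = sum(
    f((x0 + x1) / 2) * (g(x1) - g(x0))
    for x0, x1 in zip(points, points[1:])
)
print(lower, sigma, upper)
```

The three printed values appear in increasing order, and all of them lie near 3/5, the value of the Riemann integral of $3x^4$ over $[0, 1]$.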


Certain new terms must be introduced. The subdivision $\delta'$ is
described as a continuation of subdivision $\delta$ if all the points of
subdivision of $\delta$ are also points of subdivision of $\delta'$. Let $\delta_1$ and $\delta_2$ be
any two subdivisions. We form a new subdivision by taking as the
points of subdivision the points of $\delta_1$ and $\delta_2$. This new subdivision is
called the product of subdivisions $\delta_1$ and $\delta_2$ and is denoted by the
symbol $\delta_1\delta_2$. The subdivision $\delta_1\delta_2$ is obviously a continuation of $\delta_1$
and of $\delta_2$. Obviously, we can also introduce the concept of the product
of any finite number of subdivisions $\delta_1\delta_2\ldots\delta_m$. It may be further
remarked that the sums $s_\delta$ and $S_\delta$ depend only on the choice of the
subdivision $\delta$, whereas the sum $\sigma_\delta$ depends also on the choice of
points $\xi_k$. We shall now prove some extremely simple theorems.

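THEOREM 1. If the subdivision $\delta'$ is a continuation of subdivision $\delta$,
then $s_{\delta'} \ge s_\delta$ and $S_{\delta'} \le S_\delta$.

Let us prove say that $s_{\delta'} \ge s_\delta$. On passing from $\delta$ to $\delta'$, every
sub-interval of $\delta$ can be split into a finite number of parts:

$$x_{k-1} = x_0^{(k)} < x_1^{(k)} < \ldots < x_{p_k - 1}^{(k)} < x_{p_k}^{(k)} = x_k,$$

and we obtain, instead of the term $m_k[g(x_k) - g(x_{k-1})]$ of $s_\delta$, the
following sum:

$$\sum_{s=1}^{p_k} m_s^{(k)} [g(x_s^{(k)}) - g(x_{s-1}^{(k)})],$$

where $m_s^{(k)}$ is the strict lower bound of $f(x)$ in the sub-interval
$[x_{s-1}^{(k)}, x_s^{(k)}]$. We obviously have $m_s^{(k)} \ge m_k$, so that we have, on observing
that the difference $g(x_s^{(k)}) - g(x_{s-1}^{(k)})$ is non-negative:

$$\sum_{s=1}^{p_k} m_s^{(k)} [g(x_s^{(k)}) - g(x_{s-1}^{(k)})] \ge \sum_{s=1}^{p_k} m_k [g(x_s^{(k)}) - g(x_{s-1}^{(k)})] = m_k [g(x_k) - g(x_{k-1})],$$

and the theorem is proved [cf. I, 112].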

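THEOREM 2. If $\delta_1$ and $\delta_2$ are any two subdivisions, $s_{\delta_1} \le S_{\delta_2}$.

The inequality $s_\delta \le S_\delta$ for the same subdivision $\delta$ follows at
once from the fact that $m_k \le M_k$ and $g(x_k) - g(x_{k-1}) \ge 0$. We therefore
have $s_{\delta_1\delta_2} \le S_{\delta_1\delta_2}$ for the subdivision $\delta_1\delta_2$. On the other hand, by
Theorem 1, $s_{\delta_1} \le s_{\delta_1\delta_2}$ and $S_{\delta_2} \ge S_{\delta_1\delta_2}$, whence it follows that
$s_{\delta_1} \le S_{\delta_2}$.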
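
Theorems 1 and 2 can be observed on a small example. The sketch below again assumes, for the sake of exact bounds, a non-decreasing $f(x)$; the two subdivisions, the example functions and the helper names `darboux_sums` and `product` are our own illustrative choices.

```python
def darboux_sums(f, g, points):
    """Lower and upper Stieltjes-Darboux sums for a non-decreasing f."""
    lower = sum(f(x0) * (g(x1) - g(x0)) for x0, x1 in zip(points, points[1:]))
    upper = sum(f(x1) * (g(x1) - g(x0)) for x0, x1 in zip(points, points[1:]))
    return lower, upper


def product(points_1, points_2):
    """Product of two subdivisions: all points of both, in increasing order."""
    return sorted(set(points_1) | set(points_2))


def f(x):
    return x * x


def g(x):
    return x ** 3


delta_1 = [k / 7 for k in range(8)]     # 7 equal parts of [0, 1]
delta_2 = [k / 10 for k in range(11)]   # 10 equal parts of [0, 1]
delta_12 = product(delta_1, delta_2)

s1, S1 = darboux_sums(f, g, delta_1)
s2, S2 = darboux_sums(f, g, delta_2)
s12, S12 = darboux_sums(f, g, delta_12)
print(s1, s12, S12, S2)
```

The printed chain is increasing: passing to the product raises the lower sum and lowers the upper sum (Theorem 1), and consequently a lower sum for one subdivision never exceeds an upper sum for another (Theorem 2).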

Let $i$ denote the strict upper bound of sums $s_\delta$ for all possible laws
of subdivision $\delta$ and $I$ the strict lower bound of sums $S_\delta$:

$$i = \sup s_\delta, \qquad I = \inf S_\delta. \tag{14}$$

It follows at once from the definition of strict bounds and Theorem 2
that $s_{\delta_1} \le i \le I \le S_{\delta_2}$ for any subdivisions $\delta_1$ and $\delta_2$, and in particular:

$$s_\delta \le i \le I \le S_\delta. \tag{15}$$


Let us find the necessary and sufficient condition for equality of
the strict bounds $i$ and $I$. The essential role is played here by the
difference

$$S_\delta - s_\delta = \sum_{k=1}^{n} (M_k - m_k)[g(x_k) - g(x_{k-1})]. \tag{16}$$

THEOREM 3. The necessary and sufficient condition for $i$ and $I$ to be
equal is that there exist a sequence of subdivisions $\delta_n$ $(n = 1, 2, \ldots)$,
such that $S_{\delta_n} - s_{\delta_n} \to 0$.

Sufficiency. If a sequence of subdivisions $\delta_n$ exists for which
$S_{\delta_n} - s_{\delta_n} \to 0$, we obtain $i = I$ by applying inequality (15) to this
sequence.

Necessity. Let $i = I = A$. By the definition of strict bounds,
there exists a sequence of subdivisions $\delta_n'$ such that $s_{\delta_n'} \to A$, and a
sequence of subdivisions $\delta_n''$ such that $S_{\delta_n''} \to A$. We take the sequence
of subdivisions $\delta_n = \delta_n' \delta_n''$. By Theorem 1, $s_{\delta_n} \ge s_{\delta_n'}$ and $S_{\delta_n} \le S_{\delta_n''}$,
where $s_{\delta_n'}$ and $s_{\delta_n} \le A$, and $S_{\delta_n''}$ and $S_{\delta_n} \ge A$. All the more, therefore,
$s_{\delta_n} \to A$ and $S_{\delta_n} \to A$, so that $S_{\delta_n} - s_{\delta_n} \to 0$, and the theorem is
proved. It is worth noticing that the sub-intervals in the subdivisions
$\delta_n$ need not necessarily become indefinitely smaller. For instance,
it may happen that all the subdivisions $\delta_n$ consist of the same
subdivision $\delta$. The following corollary is an immediate consequence of (15):

COROLLARY. If $S_{\delta_n} - s_{\delta_n} \to 0$, then $i = I$, $s_{\delta_n} \to i$ and $S_{\delta_n} \to i$. The
above necessary and sufficient condition for $i = I$ can be stated in
terms of the sums $\sigma_\delta$.

THEOREM 4. The necessary and sufficient condition for the difference
$S_{\delta_n} - s_{\delta_n}$ to tend to zero is that the $\sigma_{\delta_n}$ have a definite limit for any
choice of points $\xi_k^{(n)}$, and if this condition is fulfilled, the limit of $\sigma_{\delta_n}$
is equal to $i$ (or $I = i$).

Necessity. If $S_{\delta_n} - s_{\delta_n} \to 0$, as we have seen, $s_{\delta_n} \to i$ and $S_{\delta_n} \to i$,
so that we have $\sigma_{\delta_n} \to i$ for the $\sigma_{\delta_n}$, which satisfy the inequality
$s_{\delta_n} \le \sigma_{\delta_n} \le S_{\delta_n}$.

Sufficiency. Let

$$\sigma_{\delta_n} = \sum_{k=1}^{p_n} f(\xi_k^{(n)}) \left[g(x_k^{(n)}) - g(x_{k-1}^{(n)})\right] \to A,$$

where the $x_k^{(n)}$ are the points of subdivision of $\delta_n$ and the $\xi_k^{(n)}$ are
points of the intervals $[x_{k-1}^{(n)}, x_k^{(n)}]$. Further we write $m_k^{(n)}$ and $M_k^{(n)}$
for the strict lower and strict upper bounds of $f(x)$ in the sub-interval
$[x_{k-1}^{(n)}, x_k^{(n)}]$. Let $\varepsilon$ be any given positive number. By virtue of the
condition $\sigma_{\delta_n} \to A$, there exists an $N$ such that

$$|A - \sigma_{\delta_n}| \le \varepsilon \quad \text{for } n \ge N \qquad (17)$$

and any choice of points $\xi_k^{(n)}$. By the definition of strict lower bound,
we can choose points $\xi_k^{(n)}$ such that the inequalities
$0 \le f(\xi_k^{(n)}) - m_k^{(n)} \le \varepsilon$ are satisfied. We now have:

$$0 \le \sigma_{\delta_n} - s_{\delta_n} = \sum_{k=1}^{p_n} \left[f(\xi_k^{(n)}) - m_k^{(n)}\right]\left[g(x_k^{(n)}) - g(x_{k-1}^{(n)})\right] \le \varepsilon \sum_{k=1}^{p_n} \left[g(x_k^{(n)}) - g(x_{k-1}^{(n)})\right] = \varepsilon[g(b) - g(a)], \qquad (18)$$

and consequently, on writing $A - s_{\delta_n}$ as $A - s_{\delta_n} = (A - \sigma_{\delta_n}) + (\sigma_{\delta_n} - s_{\delta_n})$,
we obtain, by (17) and (18):
$|A - s_{\delta_n}| \le |A - \sigma_{\delta_n}| + |\sigma_{\delta_n} - s_{\delta_n}| \le \varepsilon[1 + g(b) - g(a)]$ for $n \ge N$,
whence it follows, since $\varepsilon$ is arbitrary, that $s_{\delta_n} \to A$. It can be shown similarly that
$S_{\delta_n} \to A$, so that $S_{\delta_n} - s_{\delta_n} \to 0$, and the theorem is proved. The
limit $A$ is obviously the same as the numbers $i$ and $I$, which are equal
in the present case. The following corollary is an immediate
consequence of this and the preceding theorem:

COROLLARY. The necessary and sufficient condition for $i = I$ is
that a sequence of subdivisions $\delta_n$ exist such that $\sigma_{\delta_n}$ has a definite limit
for any choice of points $\xi_k^{(n)}$. If this condition is satisfied, the limit
mentioned is equal to $i$ (or $I = i$).

THEOREM 5. If, for a sequence of subdivisions $\delta_n$, $\sigma_{\delta_n}$ has a definite
limit and $\delta'_n$ is a continuation of $\delta_n$, then $\sigma_{\delta'_n}$ has the same limit.

It follows from the conditions of the theorem and Theorem 4 that
$S_{\delta_n} - s_{\delta_n} \to 0$. By Theorem 1, $s_{\delta'_n} \ge s_{\delta_n}$ and $S_{\delta'_n} \le S_{\delta_n}$. Consequently,
all the more $S_{\delta'_n} - s_{\delta'_n} \to 0$, i.e. $\sigma_{\delta'_n} \to i$, and the theorem is proved.

In the case of Riemann's integral, i.e. $g(x) = x$, we proved earlier
[I, 112] that $s_\delta \to i$ and $S_\delta \to I$ for any bounded function $f(x)$ as
the sub-intervals become indefinitely smaller. Hence $i = I$ is
equivalent in the case of the Riemann integral to the fact that the sum
$\sigma_\delta$ has a definite limit as the sub-intervals become indefinitely smaller,
this limit being equal to $i$. This is not true in the general case. If $\sigma_\delta$
has a definite limit as the sub-intervals become indefinitely smaller,
$i = I$ by virtue of the corollary to Theorem 4. But the converse does
not hold. The condition that $i = I$ merely implies that a sequence
of subdivisions $\delta_n$ exists such that $\sigma_{\delta_n}$ has a definite limit. We cannot
assert that $\sigma_\delta$ has a definite limit for any sequence of subdivisions
on indefinite decrease of the sub-intervals. In the above definition of the
Stieltjes integral, we required that $\sigma_\delta$ have a definite limit on indefinite
subdivision of the interval. In later generalized types of integral
we shall replace this requirement by the weaker requirement that
$i = I$. In addition we shall extend the possibilities as regards
subdividing the basic interval of integration, as will be explained when
we give the new definitions. We turn in the next section to the
Stieltjes integral, as defined in [2], and give an important sufficient
condition for its existence.


4. The Stieltjes integral of a continuous function.

THEOREM 1. If $f(x)$ is continuous in the finite interval $[a, b]$, and
$g(x)$ is a non-decreasing bounded function, the Stieltjes integral of $f(x)$
with respect to $g(x)$ over the interval $[a, b]$ exists.

On taking into account inequalities (13) and (15), we can write

$$|i - \sigma_\delta| \le S_\delta - s_\delta = \sum_{k=1}^{n} (M_k - m_k)\,[g(x_k) - g(x_{k-1})]. \qquad (19)$$

Let $\varepsilon$ be a given positive number. By virtue of the uniform
continuity of $f(x)$ in the interval $[a, b]$, there exists a positive number $\eta$
such that $0 \le M_k - m_k \le \varepsilon$ ($k = 1, 2, \dots, n$) if the greatest of the
differences $x_k - x_{k-1}$ does not exceed $\eta$. Inequality (19) now gives us
$|i - \sigma_\delta| \le \varepsilon\,[g(b) - g(a)]$, so that $\sigma_\delta \to i$ on indefinite subdivision.
It can be shown similarly that $\sigma_\delta \to I$, so that $i = I$. This equality also
follows at once from the corollary to Theorem 4 of the previous section,
by virtue of the fact that $\sigma_\delta$ has a definite limit as the sub-intervals
become indefinitely smaller.
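
As a numerical illustration of this existence theorem, the following sketch forms the sums $\sigma_\delta$ for a uniform subdivision. The function names, the choice $f(x) = x$, $g(x) = x^2$ on $[0, 1]$, and the number of sub-intervals are assumptions made only for the example; the reference value $2/3$ uses the standard fact that for a continuously differentiable $g$ the Stieltjes integral equals the Riemann integral of $f(x)g'(x)$.

```python
from typing import Callable, List, Sequence


def uniform_subdivision(a: float, b: float, n: int) -> List[float]:
    """Points a = x_0 < x_1 < ... < x_n = b with equal spacing."""
    return [a + (b - a) * k / n for k in range(n + 1)]


def stieltjes_sum(
    f: Callable[[float], float],
    g: Callable[[float], float],
    points: Sequence[float],
    tags: Sequence[float],
) -> float:
    """Sum of f(xi_k) [g(x_k) - g(x_{k-1})], with xi_k = tags[k-1]."""
    return sum(
        f(t) * (g(x1) - g(x0))
        for x0, x1, t in zip(points, points[1:], tags)
    )


def f(x: float) -> float:
    return x


def g(x: float) -> float:
    return x * x


points = uniform_subdivision(0.0, 1.0, 1000)
left_sum = stieltjes_sum(f, g, points, points[:-1])   # xi_k = x_{k-1}
right_sum = stieltjes_sum(f, g, points, points[1:])   # xi_k = x_k
```

Because this $f$ is increasing, the two sums coincide with $s_\delta$ and $S_\delta$; their difference is $\eta\,[g(b) - g(a)] = 0.001$, and both lie within that distance of $2/3$, as inequality (19) predicts.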

It is not vitally important for the interval of integration in a
Stieltjes integral to be finite. We only need to explain what
is meant by the sub-intervals becoming indefinitely smaller when an
infinite interval is subdivided. Let us take, say, $[-\infty, +\infty]$. Given a
sequence of subdivisions of this interval into a finite number of
sub-intervals, we shall say that these latter become indefinitely
smaller if, given any positive $A$, the greatest of the differences
$(x_k - x_{k-1})$ tends to zero for the sub-intervals $[x_{k-1}, x_k]$ which have
points in common with $[-A, +A]$. If $\varphi(x)$ is continuous in the interval
$[-\infty, +\infty]$ and is strictly increasing, i.e. $\varphi(\beta) > \varphi(\alpha)$ for $\beta > \alpha$, the
change of variable $t = \varphi(x)$ transforms the interval $-\infty \le x \le +\infty$
into the finite interval $[a, b]$, where $a = \varphi(-\infty)$ and $b = \varphi(+\infty)$.
A subdivision of $[-\infty, +\infty]$ with indefinitely smaller sub-intervals
reduces to an ordinary subdivision of the finite interval $[a, b]$ with
indefinitely smaller sub-intervals.

If, for instance, $f(x)$ is continuous in the closed interval $[-\infty, +\infty]$,
whilst $g(x)$ is bounded and non-decreasing, the integral exists as
before. This can be seen e.g. simply by replacing $x$ with the new
variable $t = \arctan x$. On putting

$$f(\tan t) = f_1(t) \quad \text{and} \quad g(\tan t) = g_1(t),$$

we can write the integral over the infinite interval $[-\infty, +\infty]$ as
an integral over the finite interval $[-\pi/2, +\pi/2]$:

$$\int_{-\infty}^{+\infty} f(x)\,dg(x) = \int_{-\pi/2}^{+\pi/2} f_1(t)\,dg_1(t),$$

where $f_1(t)$ is continuous and $g_1(t)$ is bounded and non-decreasing in
$[-\pi/2, +\pi/2]$.
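
The change of variable $t = \arctan x$ can be tried out numerically. The sketch below is a hedged illustration: the helper `integral_via_arctan`, its parameters, and the test pair $f(x) = 1/(1 + x^2)$, $g(x) = \arctan x$ are assumptions chosen so that $f_1(t) = \cos^2 t$ and $g_1(t) = t$, which gives the value $\pi/2$ for the integral.

```python
import math
from typing import Callable


def integral_via_arctan(
    f: Callable[[float], float],
    g: Callable[[float], float],
    g_minus_inf: float,
    g_plus_inf: float,
    n: int = 2000,
) -> float:
    """Approximate the integral of f dg over [-inf, +inf] via t = arctan x.

    The finite interval [-pi/2, pi/2] is divided into n equal parts, the
    tags are the mid-points, and the limiting values of g at -inf and +inf
    are supplied by the caller (an assumption of this sketch).
    """
    a, b = -math.pi / 2, math.pi / 2
    t = [a + (b - a) * k / n for k in range(n + 1)]
    # g_1(t) = g(tan t); the end values are the limits of g at infinity.
    g1 = [g_minus_inf] + [g(math.tan(tk)) for tk in t[1:-1]] + [g_plus_inf]
    total = 0.0
    for k in range(1, n + 1):
        mid = 0.5 * (t[k - 1] + t[k])
        total += f(math.tan(mid)) * (g1[k] - g1[k - 1])  # f_1(mid) * dg_1
    return total


value = integral_via_arctan(
    lambda x: 1.0 / (1.0 + x * x), math.atan, -math.pi / 2, math.pi / 2
)
```

With these choices `value` agrees with $\pi/2$ to well within $10^{-4}$.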

We must mention a practically important modification of the
fundamental existence theorem for the Stieltjes integral:

THEOREM 2. If $f(x)$ is continuous and bounded inside the interval of
integration, and the non-decreasing function $g(x)$ is continuous at the
ends of the interval, $f(x)$ is integrable with respect to $g(x)$.

Suppose that the interval of integration is $[-\infty, +\infty]$.

Let us consider the terms on the right-hand side of (19). Since
$f(x)$ is bounded, we have $|f(x)| \le L$, where $L$ is a definite positive
number, so that $0 \le M_k - m_k \le 2L$. The terms of the sum (19) that
correspond to the intervals $[x_{k-1}, x_k]$ having no points in common
with $[-A, A]$ yield a sum not greater than

$$2L[g(-A) - g(-\infty)] + 2L[g(+\infty) - g(A)]. \qquad (20)$$

Since $g(x)$ is assumed continuous at the points $\pm\infty$, we can choose $A$
so large that (20) is less than any given positive $\varepsilon$. We fix $A$
in this way and consider the remaining terms of sum (19). The intervals
$[x_{k-1}, x_k]$ corresponding to them are either wholly contained inside
$[-A, +A]$, or the two extreme sub-intervals fall partly outside
$[-A, +A]$, the length of the parts outside being not greater than $\eta$, where
$\eta$ is the greatest of the differences $x_k - x_{k-1}$ for the sub-intervals
having points in common with $[-A, +A]$. As the sub-intervals
become indefinitely smaller, this number $\eta$ tends to zero, and it will
always be less than unity as from a certain stage in the subdivision.
Hence all the sub-intervals $[x_{k-1}, x_k]$ that we are now considering
will belong, as from a certain stage in the subdivision, to the interval
$[-A - 1, A + 1]$, in which $f(x)$ is uniformly continuous. In view of
this, we have $0 \le M_k - m_k \le \varepsilon$ for all sufficiently small values of $\eta$,
and we now have, for the terms of (19) that correspond to sub-intervals
$[x_{k-1}, x_k]$ having points in common with $[-A, +A]$:

$$0 \le (M_k - m_k)\,[g(x_k) - g(x_{k-1})] \le \varepsilon\,[g(x_k) - g(x_{k-1})],$$

and the sum of these terms will be not greater than
$\varepsilon\,[g(A + 1) - g(-A - 1)]$.

Finally, inequality (19) gives us

$$|i - \sigma_\delta| \le \varepsilon\,[1 + g(A + 1) - g(-A - 1)] \le \varepsilon\,[1 + g(+\infty) - g(-\infty)],$$

whence it follows, since $\varepsilon$ is arbitrary, that $\sigma_\delta \to i$, and the theorem
is proved.

Some supplementary properties of the Stieltjes integral may be
mentioned, when $f(x)$ is continuous and $g(x)$ increasing. If $|f(x)| \le L$,
we have

$$\left|\int_a^b f(x)\,dg(x)\right| \le L[g(b) - g(a)], \qquad (21)$$

which is obtained by passing to the limit in the obvious inequality
for the sum $\sigma_\delta$. The mean value theorem obviously holds [cf. I, 92]:

$$\int_a^b f(x)\,dg(x) = f(\xi)\,[g(b) - g(a)] \quad (\xi \text{ in } [a, b]). \qquad (21_1)$$


Now let a sequence of functions $f_n(x)$, continuous in $[a, b]$, tend
uniformly to the limit function $f(x)$ in this interval. The latter function
is also continuous in $[a, b]$, and is therefore integrable with respect
to $g(x)$. Given any positive $\varepsilon$, an $N$ exists, by virtue of the uniform
convergence of the sequence $f_n(x)$, such that $|f(x) - f_n(x)| \le \varepsilon$ for
$x$ in $[a, b]$ provided $n \ge N$. We obtain on making use of (21):

$$\left|\int_a^b [f(x) - f_n(x)]\,dg(x)\right| \le \varepsilon\,[g(b) - g(a)],$$

whence, since $\varepsilon$ is arbitrary:

$$\lim_{n \to \infty} \int_a^b f_n(x)\,dg(x) = \int_a^b f(x)\,dg(x). \qquad (22)$$

By using the same inequalities as when proving Theorem 2, it can
easily be shown that (22) remains valid with the following assumptions:
the functions $f_n(x)$ are continuous inside $[a, b]$ and are bounded by
the same number, i.e. $|f_n(x)| \le L$, where the positive number $L$ is
the same for all $n$; $f_n(x) \to f(x)$ uniformly in every closed interval
lying inside $[a, b]$; and $g(x)$ is continuous at the ends of $[a, b]$.


5. The improper Stieltjes integral. If $f(x)$ is continuous inside
$[-\infty, +\infty]$ and bounded, whilst $g(x)$ is non-decreasing and continuous
at the ends of the interval, as we have seen, the integral of $f(x)$ with
respect to $g(x)$ over $[-\infty, +\infty]$ can be defined in the usual way,
as the limit of the finite sums $\sigma_\delta$. Now let $f(x)$, continuous inside
$[-\infty, +\infty]$, be unbounded, whilst $g(x)$ is non-decreasing and bounded as
before. Given any finite $a$ and $b$, we can form the integral of $f(x)$
with respect to $g(x)$ over the interval $[a, b]$. If this integral has a
definite finite limit as $a$ tends to $-\infty$ and $b$ to $+\infty$, this limit is
taken as the value of the integral over the interval $(-\infty, +\infty)$:

$$\int_{-\infty}^{+\infty} f(x)\,dg(x) = \lim_{\substack{a \to -\infty \\ b \to +\infty}} \int_a^b f(x)\,dg(x). \qquad (23)$$

If the conditions indicated at the start of this section are fulfilled,
so that the integral over $[-\infty, +\infty]$ exists as the limit of the sum $\sigma_\delta$,
it may easily be shown that (23) holds.

Suppose that the integrals $\int_a^b |f(x)|\,dg(x)$ remain bounded with any
choice of $a$ and $b$. In this case the integral

$$\int_{-\infty}^{+\infty} |f(x)|\,dg(x) = \lim_{\substack{a \to -\infty \\ b \to +\infty}} \int_a^b |f(x)|\,dg(x)$$

exists, and integral (23) obviously also exists [cf. II, 82], being described as
absolutely convergent in this case.
We take any subdivision of the infinite interval by points $x_k$
($k = \dots, -3, -2, -1, 0, 1, 2, 3, \dots$):

$$\dots < x_{-2} < x_{-1} < x_0 < x_1 < x_2 < \dots \qquad (24)$$

$$\left(\lim_{k \to -\infty} x_k = -\infty \ \text{and} \ \lim_{k \to +\infty} x_k = +\infty\right).$$

Let $m_j$ and $M_j$ be the least and greatest values of $f(x)$ in the interval
$[x_{j-1}, x_j]$ and $\omega_j = M_j - m_j$. We obtain by using ($21_1$) of [4]:

$$\left|\int_{x_{j-1}}^{x_j} f(x)\,dg(x) - f(\xi_j)\,[g(x_j) - g(x_{j-1})]\right| \le \omega_j\,[g(x_j) - g(x_{j-1})]$$

and

$$\left|\int_{x_{-p}}^{x_q} f(x)\,dg(x) - \sum_{j=-p+1}^{q} f(\xi_j)\,[g(x_j) - g(x_{j-1})]\right| \le \sum_{j=-p+1}^{q} \omega_j\,[g(x_j) - g(x_{j-1})]. \qquad (25)$$

Let the set of numbers $\omega_j$ ($j = 0, \pm 1, \pm 2, \dots$) have a finite strict
upper bound $\omega = \sup \omega_j$. By virtue of the continuity of $f(x)$, we can
construct in particular a subdivision (24) of the infinite interval in
which $\omega$ is less than any previously assigned positive number. We
introduce the notation:

$$S_{p,q} = \sum_{j=-p+1}^{q} f(\xi_j)\,[g(x_j) - g(x_{j-1})]; \qquad S'_{p,q} = \sum_{j=-p+1}^{q} |f(\xi_j)|\,[g(x_j) - g(x_{j-1})].$$

Further, let $\omega'_j$ be the value of $\omega_j$ for $|f(x)|$ and $\omega' = \sup \omega'_j$.

We obviously have $\omega'_j \le \omega_j$ and $\omega' \le \omega$. It follows from (25) that

$$\left|\int_{x_{-p}}^{x_q} f(x)\,dg(x) - S_{p,q}\right| \le \omega(B - A) \qquad (26_1)$$

and similarly:

$$\left|\int_{x_{-p}}^{x_q} |f(x)|\,dg(x) - S'_{p,q}\right| \le \omega'(B - A), \qquad (26_2)$$

whence it follows that

$$S'_{p,q} \le \int_{x_{-p}}^{x_q} |f(x)|\,dg(x) + \omega'(B - A) \qquad (27)$$

and

$$\int_{x_{-p}}^{x_q} |f(x)|\,dg(x) \le S'_{p,q} + \omega'(B - A). \qquad (28)$$

We now prove a theorem which gives the necessary and sufficient
condition for absolute convergence of integral (23).

THEOREM. The necessary and sufficient condition for absolute
convergence of integral (23) is that there exist a subdivision with finite $\omega$
and numbers $\xi_j$ for it satisfying $x_{j-1} \le \xi_j \le x_j$, such that the series

$$\sum_{j=-\infty}^{+\infty} f(\xi_j)\,[g(x_j) - g(x_{j-1})] \qquad (29)$$

is absolutely convergent. If this condition is satisfied, series (29) is
convergent for any subdivision (24) with finite $\omega$ and any choice of $\xi_j$
from the interval $[x_{j-1}, x_j]$, and

$$\int_{-\infty}^{+\infty} f(x)\,dg(x) = \lim_{\omega \to 0} \sum_{j=-\infty}^{+\infty} f(\xi_j)\,[g(x_j) - g(x_{j-1})]. \qquad (30)$$


Suppose that integral (23) is absolutely convergent. Inequality (27)
now gives, for any subdivision with finite $\omega$:

$$S'_{p,q} \le \int_{-\infty}^{+\infty} |f(x)|\,dg(x) + \omega'(B - A),$$

i.e. the sum $S'_{p,q}$, which increases as $p$ and $q$ increase, remains bounded,
and series (29) is therefore absolutely convergent for any subdivision
(24) with finite $\omega$. Furthermore, (30) follows at once from ($26_1$).
Now suppose conversely that series (29) is absolutely convergent for
some subdivision (24) with finite $\omega$ and for a certain choice of $\xi_j$.
It follows at once from (28) that

$$\int_{x_{-p}}^{x_q} |f(x)|\,dg(x) \le \sum_{j=-\infty}^{+\infty} |f(\xi_j)|\,[g(x_j) - g(x_{j-1})] + \omega'(B - A),$$

whence it is clear that the integral on the left remains bounded as
$p$ and $q$ increase, i.e. integral (23) is absolutely convergent. But now,
as we have just seen, series (29) is absolutely convergent for any
subdivision with finite $\omega$ and any choice of $\xi_j$, and (30) holds.

Note. If $f(x)$ is uniformly continuous inside $[-\infty, +\infty]$, and $\delta$
is the greatest of the differences $(x_j - x_{j-1})$, the condition $\delta \to 0$
implies $\omega \to 0$, and we can write $\delta \to 0$ instead of $\omega \to 0$ in (30).
This will be the case, for instance, if $f(x) = x$.
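
The representation of an improper integral by the series (29) can be illustrated for $f(x) = x$. The sketch below is an assumption-laden example: the helper `series_sum`, the truncation bound, the equally spaced subdivision $x_j = j\delta$ with $\xi_j = x_j$, and the choice $g(x) = 1 - e^{-x}$ for $x > 0$ (zero otherwise) are all illustrative. For this $g$ the integral of $x$ with respect to $g$ equals $\int_0^\infty x e^{-x}\,dx = 1$.

```python
import math
from typing import Callable


def series_sum(
    f: Callable[[float], float],
    g: Callable[[float], float],
    step: float,
    bound: float,
) -> float:
    """Partial sum of series (29) for x_j = j * step and xi_j = x_j.

    Only the terms with |x_j| <= bound are kept; the bound is assumed
    large enough for the neglected tail to be negligible.
    """
    n = int(round(bound / step))
    total = 0.0
    for j in range(-n + 1, n + 1):
        x_prev, x_j = (j - 1) * step, j * step
        total += f(x_j) * (g(x_j) - g(x_prev))
    return total


def g(x: float) -> float:
    return 1.0 - math.exp(-x) if x > 0 else 0.0  # bounded, non-decreasing


value = series_sum(lambda x: x, g, step=0.01, bound=60.0)
```

Here $\omega = \delta = 0.01$ and the total increase of $g$ is 1, so by (25) the sum can differ from the integral by at most 0.01; the computed value is close to $\delta/(1 - e^{-\delta}) \approx 1.005$.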


6. Jump functions. Let us carry out an elementary analysis of the
properties of a non-decreasing function $g(x)$. Since a monotonic
bounded variable has a limit, the function $g(x)$ will have a limit from
the left and right at every interior point of the interval $[a, b]$: $g(x - 0)$
and $g(x + 0)$. There will be a limit from the right $g(a + 0)$ at the
left-hand end, and a limit from the left $g(b - 0)$ at the right-hand
end. If $g(x - 0) = g(x + 0)$, $g(x)$ is continuous at the point $x$.

Similarly, continuity at the ends is guaranteed by the equations
$g(a + 0) = g(a)$ and $g(b - 0) = g(b)$. We have $g(x + 0) > g(x - 0)$
at points of discontinuity, and the positive difference
$g(x + 0) - g(x - 0)$ is called the jump of $g(x)$ at the point $x$. Jumps at the
ends are similarly defined.

A function $g(x)$ can have an infinite set of points of discontinuity.
Let us show that, in this case, the set of points of discontinuity of
$g(x)$ must be denumerable. The total increase of $g(x)$ in the interval
$[a, b]$ is given by the positive number $g(b) - g(a)$. The number of
points of discontinuity at which the jump is greater than unity is
therefore not greater than the integral part of the number $g(b) - g(a)$,
i.e. there is a finite number of such points of discontinuity. Similarly,
the number of points of discontinuity at which the jump is greater
than $1/2$ is not greater than the integral part of the number
$2[g(b) - g(a)]$, and so on. It may now easily be shown that the
points of discontinuity of $g(x)$ can be enumerated. We first enumerate,
in any order, the finite number of points of discontinuity at which
the jump is greater than unity. We proceed by enumerating the
remaining points at which the jump is greater than $1/2$, and so on.

When integrating a continuous function, we need not use for the
subdivision of the interval of integration the points lying inside
$[a, b]$ where $g(x)$ is discontinuous, and the values of $g(x)$ at these
points therefore play no part in the formation of the integral. The
situation is different at the ends of the interval, since they are
necessarily included in the points of subdivision. We can assume, say,
that $g(x)$ is continuous from the right at the points of discontinuity,
i.e. $g(x) = g(x + 0)$. Let $h(x)$ be the function $g(x)$ thus modified,
i.e. $h(x) = g(x)$ at points where $g(x)$ is continuous and at the
right-hand end, and $h(x) = g(x + 0)$ at points of discontinuity. Only the
change in $g(x)$ at the left-hand end can have an effect on the size of
the integral, and we have the obvious formula:

$$\int_a^b f(x)\,dh(x) = \int_a^b f(x)\,dg(x) - f(a)\,[g(a + 0) - g(a)].$$

We now split $g(x)$ into two terms when it is discontinuous; one term
is a continuous non-decreasing function $g_c(x)$, whilst the second, $g_d(x)$,
gives the sum of the jumps of $g(x)$ in the interval $[a, x]$. This latter
term is usually called the jump function for $g(x)$. Its precise
construction is as follows.

Let $c_k$ ($k = 1, 2, 3, \dots$) be a finite or denumerable set of points
in the interval $[a, b]$. We define increasing functions $\varphi_k(x)$ and $\psi_k(x)$
as follows:

$$\varphi_k(x) = \begin{cases} 0 & \text{for } x < c_k, \\ \alpha_k & \text{for } x \ge c_k, \end{cases} \qquad \psi_k(x) = \begin{cases} 0 & \text{for } x \le c_k, \\ \beta_k & \text{for } x > c_k, \end{cases}$$

where $\alpha_k$ and $\beta_k$ are non-negative constants such that the series

$$\sum_{k=1}^{\infty} \alpha_k \quad \text{and} \quad \sum_{k=1}^{\infty} \beta_k \qquad (31)$$

are convergent. If a constant $\alpha_k$ is zero, the corresponding function
$\varphi_k(x)$ vanishes identically, and the same for $\psi_k(x)$ if $\beta_k = 0$. We shall
include these functions in future formulae for the sake of
symmetry. If $c_k = a$, we shall assume that the corresponding $\alpha_k$
vanishes, and if $c_k = b$, we assume that the corresponding $\beta_k$ is zero.
It follows at once from the convergence of series (31) that the series

$$\varphi(x) = \sum_{k=1}^{\infty} \varphi_k(x); \qquad \psi(x) = \sum_{k=1}^{\infty} \psi_k(x), \qquad (32)$$

whose terms are non-negative increasing functions, are uniformly
convergent for all $x$ and, in particular, in $[a, b]$. If $x$ differs from the $c_k$,
all the terms of these series are continuous at the point $x$, and
consequently, in view of the uniform convergence, the functions $\varphi(x)$
and $\psi(x)$ are continuous at all $x$ differing from the $c_k$. At a point $x = c_k$
the term $\varphi_k(x)$ has a jump from the left equal to $\alpha_k$, the term $\psi_k(x)$ has
a jump from the right equal to $\beta_k$, and the remaining terms are
continuous. In view of the uniform convergence, the sum of the
remaining terms is also continuous at $x = c_k$. At a point $x = c_k$,
therefore, $\varphi(x)$ has a jump from the left equal to $\alpha_k$ and is continuous
from the right, whilst $\psi(x)$ has a jump from the right equal to $\beta_k$
and is continuous from the left. The whole of this construction obviously
retains its validity in the case when the set of points $c_k$ is finite.
Now let $g(x)$ be an increasing function and $x = c_k$ its points of
discontinuity, whilst $\alpha_k$ and $\beta_k$ are its jumps from the left and right
at these points, i.e. $\alpha_k = g(c_k) - g(c_k - 0)$ and $\beta_k = g(c_k + 0) - g(c_k)$.
The difference $g(b) - g(a)$ gives the total increase of $g(x)$ in $[a, b]$,
and the sum of its total jumps $\gamma_k = \alpha_k + \beta_k$ at the first $n$ points
$c_1, c_2, \dots, c_n$ of discontinuity is not greater than $g(b) - g(a)$ for any $n$.
Hence the infinite series consisting of the total jumps $\gamma_k$ of $g(x)$
must be convergent. The series consisting of the jumps from the left
$\alpha_k$ and the jumps from the right $\beta_k$ must be all the more convergent.
We form the functions $\varphi(x)$ and $\psi(x)$ and put $g_d(x) = \varphi(x) + \psi(x)$.
The quantity $g_d(x)$ is obviously equal to the sum of the jumps of
$g(x)$ at all points of discontinuity lying to the left of $x$, and the jump
from the left at $x$ itself if it exists, whilst the difference $g_d(\beta) - g_d(\alpha)$
is equal to the sum of the jumps at the points of discontinuity lying
between $\alpha$ and $\beta$, the jump from the right at the point $\alpha$ and the
jump from the left at the point $\beta$. The difference $g(\beta) - g(\alpha)$ gives
the total increase of $g(x)$ when $x$ varies from $\alpha$ to $\beta$, whilst the difference
$g_d(\beta) - g_d(\alpha)$ gives the increase of $g(x)$ which is obtained by taking
into account only the jumps at its points of discontinuity. We thus
have the obvious inequality:

$$g(\beta) - g(\alpha) \ge g_d(\beta) - g_d(\alpha) \quad \text{for } \beta > \alpha.$$

Let $g_c(x) = g(x) - g_d(x)$. If $x$ is a point at which $g(x)$ is continuous,
it is a point at which $g_d(x)$ is also continuous, i.e. at which $g_c(x)$ is
continuous. Now let $x$ be equal to one of the $c_k$. At this point $g_d(x)$
has the same jumps as $g(x)$ from the left and right, so that $g_c(x)$ is
continuous at $x = c_k$ also. We can therefore say that $g_c(x)$ is continuous
and increasing. We thus have the required decomposition

$$g(x) = g_d(x) + g_c(x). \qquad (33)$$

This decomposition can be performed for any interval, closed or
not, finite or infinite. We can write for any continuous function:

$$\int_a^b f(x)\,dg(x) = \int_a^b f(x)\,dg_d(x) + \int_a^b f(x)\,dg_c(x). \qquad (34)$$


Let us show that the first of the integrals on the right-hand side
can be written as the sum

$$\int_a^b f(x)\,dg_d(x) = \sum_{k=1}^{\infty} f(c_k)\,\gamma_k, \qquad (35)$$

where the $c_k$ are points at which $g(x)$ is discontinuous and the $\gamma_k$ are the
total jumps of $g(x)$ at these points. We shall assume that the number
of points of discontinuity is infinite. On putting
$\omega_k(x) = \varphi_k(x) + \psi_k(x)$, we can write

$$g_d(x) = s_m(x) + r_m(x),$$

where

$$s_m(x) = \sum_{k=1}^{m} \omega_k(x); \qquad r_m(x) = \sum_{k=m+1}^{\infty} \omega_k(x).$$

We have the inequality $0 \le r_m(x) \le \gamma_{m+1} + \gamma_{m+2} + \dots$, and, in
view of the convergence of the series composed of the $\gamma_k$, given any
positive $\varepsilon$ we can fix an $N$ such that, for any $x$,

$$0 \le r_m(x) \le \varepsilon \quad \text{for } m \ge N. \qquad (36)$$

Further, since $f(x)$ is continuous, we have [2]:

$$\int_a^b f(x)\,d\omega_k(x) = f(c_k)\,\gamma_k,$$

so that

$$\int_a^b f(x)\,ds_m(x) = \sum_{k=1}^{m} f(c_k)\,\gamma_k. \qquad (37)$$

The function $f(x)$ is bounded, i.e. $|f(x)| \le L$, and for the terms of
the last sum we have the inequality $|f(c_k)\gamma_k| \le L\gamma_k$, whence it is
clear that the series composed of the numbers $f(c_k)\gamma_k$ is absolutely
convergent.

By (36), we have for the integral with respect to the non-decreasing
function $r_m(x)$:

$$\left|\int_a^b f(x)\,dr_m(x)\right| \le L\varepsilon \quad (m \ge N),$$

whence, since $\varepsilon$ is arbitrary, it follows that the difference

$$\int_a^b f(x)\,dg_d(x) - \int_a^b f(x)\,ds_m(x) = \int_a^b f(x)\,dr_m(x)$$

tends to zero as $m$ increases, i.e.

$$\int_a^b f(x)\,dg_d(x) = \lim_{m \to \infty} \int_a^b f(x)\,ds_m(x),$$

whence, by (37), we have the formula

$$\int_a^b f(x)\,dg_d(x) = \sum_{k=1}^{\infty} f(c_k)\,\gamma_k. \qquad (38)$$
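
Formula (38) lends itself to a direct numerical check when the number of jumps is finite. The sketch below is illustrative only: the helpers `make_jump_function` and `stieltjes_sum_midpoints`, the points $c_k$, the jumps $\alpha_k$, $\beta_k$, and the choice $f(x) = \cos x$ on $[0, 1]$ are all assumptions made for the example.

```python
import math
from typing import Callable, Sequence


def make_jump_function(
    c: Sequence[float], alpha: Sequence[float], beta: Sequence[float]
) -> Callable[[float], float]:
    """Build g_d = sum of phi_k + psi_k for finitely many points c_k."""

    def g_d(x: float) -> float:
        total = 0.0
        for c_k, a_k, b_k in zip(c, alpha, beta):
            if x >= c_k:
                total += a_k  # phi_k: jump from the left at c_k
            if x > c_k:
                total += b_k  # psi_k: jump from the right at c_k
        return total

    return g_d


def stieltjes_sum_midpoints(
    f: Callable[[float], float],
    g: Callable[[float], float],
    a: float,
    b: float,
    n: int,
) -> float:
    """Stieltjes sum on n equal sub-intervals with mid-point tags."""
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    return sum(
        f(0.5 * (x0 + x1)) * (g(x1) - g(x0)) for x0, x1 in zip(xs, xs[1:])
    )


c = [0.25, 0.5, 0.8]
alpha = [0.3, 0.0, 0.2]
beta = [0.1, 0.4, 0.0]

g_d = make_jump_function(c, alpha, beta)
lhs = stieltjes_sum_midpoints(math.cos, g_d, 0.0, 1.0, 10_000)
rhs = sum(math.cos(c_k) * (a_k + b_k) for c_k, a_k, b_k in zip(c, alpha, beta))
```

Every tag that multiplies a non-zero increment of $g_d$ lies within one sub-interval of the corresponding $c_k$, so `lhs` and `rhs` differ only by a quantity of the order of the step.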


7. Physical interpretation. A physical interpretation may be given
of the function $g(x)$ and the Stieltjes integral. Let matter be distributed
over the interval $[a, b]$, and let $g(x)$ be the mass contained in the
interval $[a, x]$, and $g(a)$ the mass at the point $x = a$, if such a
concentrated mass is present. Otherwise, we put $g(a) = 0$. The difference
$g(d) - g(c)$ gives the mass contained in the interval $(c, d]$. When the
positive number $h$ tends to zero the interval $(x, x + h]$ is compressed,
and any point goes outside $(x, x + h]$ for sufficiently small $h$, since
the left-hand end is not included in the interval. The function $g(x)$
is increasing (mass is positive), and, by what has been said above,
it is natural to subject the function $g(x)$, characterizing the mass
distribution, to the condition $g(x + h) - g(x) \to 0$ or $g(x) = g(x + 0)$,
i.e. $g(x)$ must be continuous from the right at all the points of
discontinuity excepting $x = b$. There is no sense in talking of the continuity
at the right-hand end of the interval, since the function is not defined
for $x > b$. Inside the interval there are concentrated masses at the
points where $g(x)$ is discontinuous, and the size of the concentrated
mass is given by the difference $g(x) - g(x - 0)$. The same applies
for the right-hand end of the interval. The total amount of matter
in the interval $[a, b]$ is equal to $g(b)$. Everything that has been said
is suitable either for a finite or an infinite interval. A characteristic
feature of the above arguments is that we have made no use of the
concept of density of the distribution. The centre of gravity of the
distributed matter will be given by

$$\xi = \frac{1}{g(b)} \int_a^b x\,dg(x).$$

This formula is suitable for a finite interval. In the case of an
infinite interval, the integrated function $f(x) = x$ ceases to be bounded,
and we have to use the definition of improper integral.
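
The centre-of-gravity formula handles continuous and concentrated masses together. The sketch below is a hedged example: the helper `centre_of_gravity`, the subdivision, and the mass distribution (unit mass spread uniformly over $[0, 1)$ plus a unit point mass at $x = 1$, with no concentrated mass at $x = a = 0$) are assumptions chosen for illustration. The exact centre is $(1/2 + 1)/2 = 0.75$.

```python
from typing import Callable


def centre_of_gravity(
    g: Callable[[float], float], a: float, b: float, n: int
) -> float:
    """Approximate (1 / g(b)) * integral of x dg(x) over [a, b].

    Uses n equal sub-intervals with mid-point tags; assumes g(a) = 0,
    i.e. no concentrated mass at the left-hand end.
    """
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    moment = sum(
        0.5 * (x0 + x1) * (g(x1) - g(x0)) for x0, x1 in zip(xs, xs[1:])
    )
    return moment / g(b)


def g(x: float) -> float:
    # mass in [0, x]: uniform density 1 on [0, 1) and a point mass 1 at x = 1
    return x if x < 1.0 else 2.0


centre = centre_of_gravity(g, 0.0, 1.0, 1000)
```

No density is used anywhere: the point mass enters only through the jump of $g$ at $x = 1$, and `centre` is within $10^{-3}$ of 0.75.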

In the theory of probability, the function $g(x)$ usually expresses
the probability distribution of some random variable, viz. $g(x)$
is equal to the probability that the random variable belongs to
the interval $(-\infty, x]$. Here, as above, $g(x)$ is continuous from the
right. The concept of the Stieltjes integral of a continuous function
can be extended readily, as we shall see, to the case when $g(x)$ is
the difference between two non-decreasing functions:
$g(x) = g_1(x) - g_2(x)$. A physical interpretation of $g(x)$ is easily given in this case.
Suppose that positive and negative charges are distributed in the
interval $(-\infty, +\infty)$. Now, $g_1(x)$ defines the total positive charge in
the interval $(-\infty, x]$, and $g_2(x)$ the total negative charge in this
interval.


8. Functions of bounded variation. We have so far assumed that
the integrating function $g(x)$ is increasing. In order to pass to integrals
with more general functions $g(x)$, we must introduce a class of functions
which is in fact the fundamental class to which all our integrating
functions $g(x)$ will have to belong. Let $g(x)$ be a given function in the
finite or infinite closed interval $[a, b]$ which takes a finite value at
every point of the interval. Let $\delta$ be a subdivision of $[a, b]$:
$a = x_0 < x_1 < \dots < x_{n-1} < x_n = b$. We form the sum:

$$t_\delta = \sum_{k=1}^{n} |g(x_k) - g(x_{k-1})|. \qquad (39)$$

DEFINITION. If the set of values of this sum is bounded for all possible
subdivisions $\delta$, the function $g(x)$ is said to be of bounded variation in the
interval $[a, b]$, whilst the strict upper bound of sums (39) is called the
total variation or simply the variation of $g(x)$ in $[a, b]$.

We shall write it symbolically as $V_a^b(g)$. Some simple properties of
the sums $t_\delta$ and of the total variation must be mentioned. If we
introduce a new point of subdivision $c$ between the points $x_{k-1}$ and
$x_k$, it follows at once from the formula

$$g(x_k) - g(x_{k-1}) = [g(x_k) - g(c)] + [g(c) - g(x_{k-1})]$$

that

$$|g(x_k) - g(x_{k-1})| \le |g(x_k) - g(c)| + |g(c) - g(x_{k-1})|,$$

i.e. the sum $t_\delta$ does not decrease on the addition of new points of
subdivision. Further, if the sums $t_\delta$, consisting of non-negative terms,
remain bounded for the interval $[a, b]$, they will all the more be
bounded for any interval $[\alpha, \beta]$ making up part of $[a, b]$, i.e. if $g(x)$
is of bounded variation in $[a, b]$, it will be of bounded variation in a
part $[\alpha, \beta]$ of $[a, b]$ and $V_\alpha^\beta(g) \le V_a^b(g)$.

If we take the interval $[a, b]$ in its entirety, this is one of the
possible subdivisions $\delta$, and since we obviously have $t_\delta \le V_a^b(g)$ for
any subdivision, we must have in particular:

$$|g(b) - g(a)| \le V_a^b(g). \qquad (40)$$

If $g(x)$ is a monotonic function in $[a, b]$, all the differences
$g(x_k) - g(x_{k-1})$ have the same sign, and the sum $t_\delta$ is equal to $g(b) - g(a)$
for any $\delta$ in the case of an increasing function, and equal to $g(a) - g(b)$
for a decreasing function, i.e. any monotonic function is a function
of bounded variation.
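
The behaviour of the sums $t_\delta$ is easy to observe numerically. In the sketch below the helper `variation_sum` and the test functions ($\sin x$ on $[0, 2\pi]$ and the increasing function $x^3$ on $[0, 2]$) are illustrative assumptions.

```python
import math
from typing import Callable, Sequence


def variation_sum(g: Callable[[float], float], points: Sequence[float]) -> float:
    """The sum t_delta of |g(x_k) - g(x_{k-1})| over a subdivision."""
    return sum(abs(g(x1) - g(x0)) for x0, x1 in zip(points, points[1:]))


# sin on [0, 2*pi]: the trivial subdivision sees almost no variation,
# whereas adding the turning points pi/2 and 3*pi/2 gives 1 + 2 + 1.
coarse = variation_sum(math.sin, [0.0, 2 * math.pi])
fine = variation_sum(math.sin, [0.0, math.pi / 2, 3 * math.pi / 2, 2 * math.pi])


def cube(x: float) -> float:
    return x ** 3


# For an increasing function every subdivision gives g(b) - g(a) = 8.
monotone = variation_sum(cube, [0.0, 0.3, 0.7, 2.0])
```

Adding points of subdivision raised the sum for $\sin x$ from nearly 0 to 4, while for the monotonic function the sum does not depend on the subdivision.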

We shall now state as separate theorems a number of properties of
functions of bounded variation.

THEOREM 1. If $g(x)$ is of bounded variation in $[a, b]$, it is bounded
in this interval.

Given any $x$ of $[a, b]$, we can write $g(x) = g(a) + [g(x) - g(a)]$, i.e.
$|g(x)| \le |g(a)| + |g(x) - g(a)|$, or, by (40),
$|g(x)| \le |g(a)| + V_a^x(g) \le |g(a)| + V_a^b(g)$, which proves that $g(x)$ is bounded.

THEOREM 2. If $g(x)$ and $h(x)$ are of bounded variation in $[a, b]$, then
$cg(x)$ ($c$ is a constant) and $g(x) + h(x)$ are also of bounded variation.

We shall give the proof for the sum. We form $t_\delta$ for $g(x) + h(x)$:

$$t_\delta = \sum_{k=1}^{n} \left|[g(x_k) + h(x_k)] - [g(x_{k-1}) + h(x_{k-1})]\right| \le \sum_{k=1}^{n} |g(x_k) - g(x_{k-1})| + \sum_{k=1}^{n} |h(x_k) - h(x_{k-1})|.$$

The last two sums are bounded, since $g(x)$ and $h(x)$ are of bounded
variation by hypothesis. Hence $t_\delta$ is all the more bounded, i.e.
$g(x) + h(x)$ is of bounded variation.

COROLLARY. Every finite linear combination of functions of bounded
variation, i.e. every expression of the form
$c_1 f_1(x) + c_2 f_2(x) + \dots + c_n f_n(x)$, is also a function of bounded variation.

THEOREM 3. If $g(x)$ and $h(x)$ are of bounded variation, their product
$g(x)h(x)$ is also of bounded variation. If, moreover, $|h(x)| \ge m > 0$,
the quotient $g(x)/h(x)$ is of bounded variation.

We consider the product, for which we form $t_\delta$:

$$t_\delta = \sum_{k=1}^{n} |g(x_k)h(x_k) - g(x_{k-1})h(x_{k-1})|. \qquad (41)$$

Since $g(x)$ and $h(x)$ are bounded, we can write $|g(x)| \le L$ and
$|h(x)| \le L$, where $L$ is a positive number.
We have the obvious equation:

$$g(x_k)h(x_k) - g(x_{k-1})h(x_{k-1}) = g(x_k)\,[h(x_k) - h(x_{k-1})] + h(x_{k-1})\,[g(x_k) - g(x_{k-1})],$$

which, in conjunction with (41), gives us

$$t_\delta \le \sum_{k=1}^{n} |g(x_k)|\,|h(x_k) - h(x_{k-1})| + \sum_{k=1}^{n} |h(x_{k-1})|\,|g(x_k) - g(x_{k-1})|$$

or

$$t_\delta \le L \sum_{k=1}^{n} |h(x_k) - h(x_{k-1})| + L \sum_{k=1}^{n} |g(x_k) - g(x_{k-1})|.$$

But the sums written are bounded, since $g(x)$ and $h(x)$ are of bounded
variation by hypothesis, so that the sums $t_\delta$ are also bounded, which
proves the theorem.

THEOREM 4. If $a < c < b$ and $g(x)$ is of bounded variation in $[a, b]$,
it is of bounded variation in $[a, c]$ and $[c, b]$, and conversely, if it is of
bounded variation in $[a, c]$ and $[c, b]$, it is of bounded variation in $[a, b]$.
We have the formula here:

$$V_a^b(g) = V_a^c(g) + V_c^b(g). \qquad (42)$$

We saw above that, if $g(x)$ is of bounded variation in $[a, b]$, it is of
bounded variation in $[a, c]$ and $[c, b]$. It remains to prove the converse
and (42). We write $t_\delta$ for the sum (39) for $[a, b]$, and $t^{(1)}_{\delta_1}$ and $t^{(2)}_{\delta_2}$
for the corresponding sums for $[a, c]$ and $[c, b]$. If $c$ is a point of the
subdivision $\delta$, $\delta$ splits into a subdivision $\delta_1$ of the interval $[a, c]$ and a
subdivision $\delta_2$ of $[c, b]$, and we have $t_\delta = t^{(1)}_{\delta_1} + t^{(2)}_{\delta_2}$. If $g(x)$ is of
bounded variation in $[a, c]$ and $[c, b]$, the previous formula gives
$t_\delta \le V_a^c(g) + V_c^b(g)$. The sums $t_\delta$ therefore remain bounded if $c$ is a
point of subdivision. They are all the more bounded for other
subdivisions, since the addition of a point of subdivision can only increase
$t_\delta$. It follows from this argument that $g(x)$ is of bounded variation in
$[a, b]$ and that $V_a^b(g) \le V_a^c(g) + V_c^b(g)$. We shall now prove the
reverse inequality, whence (42) will follow. Let $\varepsilon$ be a given positive
number. By the definition of strict upper bound, we can choose
subdivisions $\delta_1$ and $\delta_2$ in the formula $t_\delta = t^{(1)}_{\delta_1} + t^{(2)}_{\delta_2}$ such that
$t^{(1)}_{\delta_1} \ge V_a^c(g) - \varepsilon$ and $t^{(2)}_{\delta_2} \ge V_c^b(g) - \varepsilon$. We now obtain:
$t_\delta \ge V_a^c(g) + V_c^b(g) - 2\varepsilon$, whence $V_a^b(g) \ge V_a^c(g) + V_c^b(g) - 2\varepsilon$, or, since $\varepsilon$ is
arbitrary, $V_a^b(g) \ge V_a^c(g) + V_c^b(g)$, which finally proves the theorem.

COROLLARY. We have proved the theorem for the subdivision of the
interval $[a, b]$ into two parts. By applying it several times we can obtain
a similar result for the subdivision of $[a, b]$ into a finite number of
sub-intervals, i.e. if $[a, b]$ is split into a finite number of sub-intervals
and $g(x)$ is of bounded variation throughout the interval, it will be of
bounded variation in each sub-interval, and conversely; furthermore,
the total variation over the whole interval is equal to the sum of the
total variations in each sub-interval. This property is usually described
as the property of additiveness of the total variation. If
$a < x_1 < x_2 < \dots < x_{n-1} < b$ are the points of division, we can
write it in the form

\[ V_a^b(g) = V_a^{x_1}(g) + V_{x_1}^{x_2}(g) + \dots + V_{x_{n-1}}^{b}(g). \tag{43} \]
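The additiveness property can be checked numerically on sums of the form (39). The following is only a minimal sketch, not part of the original text: the helper names `variation_sum` and `uniform_subdivision`, the choice $g(x) = \sin x$, and the interval $[0, 5]$ with $c = 2$ are assumptions made for illustration. When $c$ is a point of the subdivision, the sum over $[a, b]$ splits into the sums over $[a, c]$ and $[c, b]$, and for a fine subdivision each sum lies just below the corresponding total variation.

```python
import math


def variation_sum(g, points):
    """Return the sum of |g(x_k) - g(x_{k-1})| over a subdivision (hypothetical helper)."""
    return sum(abs(g(x1) - g(x0)) for x0, x1 in zip(points, points[1:]))


def uniform_subdivision(a, b, n):
    """Return the n + 1 points dividing [a, b] into n equal parts (hypothetical helper)."""
    return [a + (b - a) * k / n for k in range(n + 1)]


a, c, b = 0.0, 2.0, 5.0
left = uniform_subdivision(a, c, 2000)    # subdivision of [a, c]
right = uniform_subdivision(c, b, 3000)   # subdivision of [c, b]
whole = left + right[1:]                  # subdivision of [a, b] containing c

t_left = variation_sum(math.sin, left)
t_right = variation_sum(math.sin, right)
t_whole = variation_sum(math.sin, whole)

# sin x is monotonic between its extrema, so its total variation over [0, 5]
# is 1 + 2 + (1 + sin 5) = 4 + sin 5.
exact = 4.0 + math.sin(5.0)

print(t_left + t_right, t_whole, exact)
```

The sums never exceed the total variation, which is why the comparison with `exact` is one-sided; refining the subdivision can only bring them closer to it.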




THEOREM 5. The necessary and sufficient condition for $g(x)$ to be of
bounded variation is that it is expressible as the difference between two
increasing functions.

The sufficiency is obvious. Increasing functions are functions of 
bounded variation, and by the corollary to Theorem 2, the difference 
between two such functions is also of bounded variation. Let us prove 
the necessity, i.e. if g(x) is of bounded variation, it is expressible as the 
difference between two increasing functions. If we put 


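\[ g_1(x) = \tfrac{1}{2} [ V_a^x(g) + g(x) ]; \qquad g_2(x) = \tfrac{1}{2} [ V_a^x(g) - g(x) ], \tag{44} \]

we have

\[ g(x) = g_1(x) - g_2(x), \tag{45} \]

and it is sufficient to show that the functions $g_1(x)$ and $g_2(x)$ are
increasing. We shall prove this for $g_1(x)$. Let $\alpha$ and $\beta$ belong to $[a, b]$
and $\alpha < \beta$. We have

\[ g_1(\beta) - g_1(\alpha) = \tfrac{1}{2} [ V_a^\beta(g) - V_a^\alpha(g) + g(\beta) - g(\alpha) ], \]

or, in view of the additiveness of the total variation:

\[ g_1(\beta) - g_1(\alpha) = \tfrac{1}{2} [ V_\alpha^\beta(g) + g(\beta) - g(\alpha) ]. \]

But, by (40), $V_\alpha^\beta(g) \ge | g(\beta) - g(\alpha) |$, whence it follows that
$g_1(\beta) - g_1(\alpha) \ge 0$.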

The increasing functions $g_1(x)$ and $g_2(x)$ can only have a finite or
denumerable set of points of discontinuity, and they have a limit from
the left and right at every such point. The same can therefore be said
of the function $g(x)$.
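The construction (44) can be imitated on a grid. In the sketch below, which is an illustration only, the running sum of $|g(x_k) - g(x_{k-1})|$ stands in for $V_a^x(g)$; the name `canonical_parts`, the sample function $g(x) = \cos 4x$ and the interval $[0, 3]$ are assumptions and do not come from the text. The two computed parts should be non-decreasing along the grid and their difference should reproduce $g$.

```python
import math


def canonical_parts(g, points):
    """Grid values of g1 = (v + g)/2 and g2 = (v - g)/2.

    Here v is the running sum of |g(x_k) - g(x_{k-1})| along `points`,
    used as a grid approximation to the total variation V_a^x(g).
    """
    g1, g2 = [], []
    v = 0.0
    prev = g(points[0])
    for x in points:
        cur = g(x)
        v += abs(cur - prev)
        g1.append(0.5 * (v + cur))
        g2.append(0.5 * (v - cur))
        prev = cur
    return g1, g2


def g(x):
    return math.cos(4.0 * x)


points = [3.0 * k / 3000 for k in range(3001)]
g1, g2 = canonical_parts(g, points)

# Smallest increment of each part; both should be non-negative up to rounding.
min_step_g1 = min(q - p for p, q in zip(g1, g1[1:]))
min_step_g2 = min(q - p for p, q in zip(g2, g2[1:]))
# Largest deviation of g1 - g2 from g on the grid.
max_defect = max(abs((p - q) - g(x)) for p, q, x in zip(g1, g2, points))

print(min_step_g1, min_step_g2, max_defect)
```

Each increment of the first part is half of $|\Delta g| + \Delta g$ and each increment of the second is half of $|\Delta g| - \Delta g$, which is the grid analogue of the inequality $V_\alpha^\beta(g) \ge |g(\beta) - g(\alpha)|$ used in the proof.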

THEOREM 6. If $g(x)$ is continuous at a point $x = c$, the function
$V_a^x(g) = v(x)$ is also continuous at this point, and conversely. If $g(x)$ is
continuous from the right (left), $v(x)$ is also continuous from the right
(left), and conversely.

Suppose that $c < b$. Let us consider say continuity from the right.
Given a positive $\varepsilon$, we can subdivide $[c, b]$ in such a way:
$c = x_0 < x_1 < \dots < x_n = b$, that

\[ \sum_{k=1}^{n} | g(x_k) - g(x_{k-1}) | > V_c^b(g) - \varepsilon. \tag{46} \]

If we add new points of subdivision, this inequality will be all the
more satisfied. We can therefore assume that the point $x_1$ is taken so
close to $c$ that $| g(x_1) - g(c) | < \varepsilon$. Use is made here of the continuity
of $g(x)$ from the right. Inequality (46) can be rewritten as

\[ | g(x_1) - g(c) | + \sum_{k=2}^{n} | g(x_k) - g(x_{k-1}) | > V_c^b(g) - \varepsilon, \]

whence we obtain, since $| g(x_1) - g(c) | < \varepsilon$:

\[ \sum_{k=2}^{n} | g(x_k) - g(x_{k-1}) | > V_c^b(g) - 2\varepsilon. \]

The sum on the left is a sum $t_\delta$ for the interval $[x_1, b]$, and it follows
from the last inequality that

\[ V_{x_1}^b(g) > V_c^b(g) - 2\varepsilon, \]

or, since the total variation is additive, we have $V_c^{x_1}(g) < 2\varepsilon$, i.e.
$v(x_1) - v(c) < 2\varepsilon$. The function $v(x)$ is increasing, and it follows from
the last inequality that $v(c + 0) - v(c) < 2\varepsilon$, whence, since $\varepsilon$ is
arbitrary, we have $v(c + 0) = v(c)$, i.e. $v(x) = V_a^x(g)$ is continuous from the
right at the point $x = c$. Conversely, if we are given that $v(x)$ is
continuous from the right, we have by (40): $| g(c + h) - g(c) | \le v(c + h) - v(c)$,
and as the positive number $h$ tends to zero the right-hand side
tends to zero, so that the left all the more tends to zero, which proves
that $g(x)$ is continuous from the right at $x = c$.

If $g(x)$ is continuous at the point $c$, by what has been proved, the
functions $g_1(x)$ and $g_2(x)$ defined by (44) are also continuous at $x = c$.
This statement obviously holds for continuity from the right or left.

THEOREM 7. If $g(x)$ is of bounded variation, and

\[ g(x) = g_1^*(x) - g_2^*(x) \tag{47} \]

is any representation of $g(x)$ as the difference between two increasing
functions, we have the following inequalities for any $\alpha < \beta$ belonging to $[a, b]$:

\[ g_1(\beta) - g_1(\alpha) \le g_1^*(\beta) - g_1^*(\alpha); \qquad g_2(\beta) - g_2(\alpha) \le g_2^*(\beta) - g_2^*(\alpha). \tag{48} \]

We confine ourselves to the proof of the first inequality, which
can be written as

\[ \tfrac{1}{2} [ V_\alpha^\beta(g) + g(\beta) - g(\alpha) ] \le g_1^*(\beta) - g_1^*(\alpha). \tag{49} \]

We use reductio ad absurdum. Suppose that the reverse inequality
holds:

\[ \tfrac{1}{2} [ V_\alpha^\beta(g) + g(\beta) - g(\alpha) ] > g_1^*(\beta) - g_1^*(\alpha). \tag{50} \]

We choose a subdivision $\delta$ of the interval $[\alpha, \beta]$ such that the sum $t_\delta$
is so close to the variation $V_\alpha^\beta(g)$ that inequality (50) still remains in
force when this total variation is replaced by the sum in question.
Thus we have, for some subdivision $\alpha = x_0 < x_1 < \dots < x_{n-1} < x_n = \beta$:

\[ \tfrac{1}{2} \Big[ \sum_{k=1}^{n} | g(x_k) - g(x_{k-1}) | + g(\beta) - g(\alpha) \Big] > g_1^*(\beta) - g_1^*(\alpha). \tag{51} \]

On the other hand, we can evidently write

\[ g(\beta) - g(\alpha) = \sum_{k=1}^{n} [ g(x_k) - g(x_{k-1}) ]; \qquad g_1^*(\beta) - g_1^*(\alpha) = \sum_{k=1}^{n} [ g_1^*(x_k) - g_1^*(x_{k-1}) ], \]

so that inequality (51) can be written as

\[ \tfrac{1}{2} \sum_{k=1}^{n} [ | g(x_k) - g(x_{k-1}) | + g(x_k) - g(x_{k-1}) ] > \sum_{k=1}^{n} [ g_1^*(x_k) - g_1^*(x_{k-1}) ]. \]

At least one of the terms on the left-hand side must be greater than
the corresponding term on the right. Let this be with $k = p$. This
leads us to the inequality:

\[ \tfrac{1}{2} [ | g(x_p) - g(x_{p-1}) | + g(x_p) - g(x_{p-1}) ] > g_1^*(x_p) - g_1^*(x_{p-1}). \tag{52} \]

If $g(x_p) - g(x_{p-1}) \le 0$, this inequality is absurd, since its left-hand
side is zero, whilst the right-hand side is non-negative because $g_1^*(x)$ is
increasing by hypothesis. It remains to suppose that $g(x_p) - g(x_{p-1}) > 0$.
In this case (52) can be written as

\[ g(x_p) - g(x_{p-1}) > g_1^*(x_p) - g_1^*(x_{p-1}), \]

or, by (47), it reduces to

\[ -[ g_2^*(x_p) - g_2^*(x_{p-1}) ] > 0, \]

which is absurd, since $g_2^*(x)$ is increasing by hypothesis. We have thus
arrived at an absurdity, and inequality (49), and hence the whole of
the theorem, is proved.
Expression (45) for $g(x)$ as the difference between the two functions
$g_1(x)$ and $g_2(x)$ given by (44) is usually called the canonical form for a
function of bounded variation as the difference between two increasing
functions. By the above theorem, the $g_1(x)$ and $g_2(x)$ appearing in the
canonical form are not more rapidly increasing than the functions
appearing in any other form. If we add the same constant $c$ to $g_1(x)$
and $g_2(x)$, this obviously has no effect on their difference, nor on their
increment in any part $[\alpha, \beta]$ of the interval $[a, b]$, and the form
obtained for $g(x)$ as the difference between $g_1(x) + c$ and $g_2(x) + c$ can
also be described as canonical.

Note. Each of the increasing functions $g_1(x)$ and $g_2(x)$ can be
split into a jump function and a continuous part:

\[ g_1(x) = g_{1d}(x) + g_{1c}(x); \qquad g_2(x) = g_{2d}(x) + g_{2c}(x). \]

This leads us to a completely determinate decomposition of $g(x)$ into a
jump function and a continuous part:

\[ g(x) = [ g_{1d}(x) - g_{2d}(x) ] + [ g_{1c}(x) - g_{2c}(x) ]. \tag{53} \]


9. An integrating function of bounded variation. If $f(x)$ is a
continuous function in $[a, b]$ and $g(x)$ is of bounded variation, by using
the expression for $g(x)$ as the difference between two increasing
functions we can write

\[ \sum_{k=1}^{n} f(\xi_k) [ g(x_k) - g(x_{k-1}) ] = \sum_{k=1}^{n} f(\xi_k) [ g_1(x_k) - g_1(x_{k-1}) ] - \sum_{k=1}^{n} f(\xi_k) [ g_2(x_k) - g_2(x_{k-1}) ]. \tag{54} \]

The sums on the right have a definite limit as the subdivision becomes
indefinitely finer, so that the same can be said of the sums on
the left, i.e. a continuous function is integrable with respect to a
function of bounded variation. Passage to the limit in (54) gives

\[ \int_a^b f(x)\, dg(x) = \int_a^b f(x)\, dg_1(x) - \int_a^b f(x)\, dg_2(x). \tag{55} \]
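Formula (54) can be tried out on a small example. The sketch below is an illustration under stated assumptions, not part of the original text: the helper `stieltjes_sum`, the midpoint choice of $\xi_k$, and the sample functions $f(x) = x$, $g(x) = |x - 1|$ on $[0, 2]$ are all hypothetical choices. For this $g$ the total variation is $V_0^x(g) = x$, so the canonical parts (44) are available in closed form, and the integral works out to $-\tfrac{1}{2} + \tfrac{3}{2} = 1$.

```python
def stieltjes_sum(f, g, points):
    """Sum of f(xi_k) [g(x_k) - g(x_{k-1})], with xi_k taken as the midpoint."""
    return sum(f(0.5 * (x0 + x1)) * (g(x1) - g(x0))
               for x0, x1 in zip(points, points[1:]))


def f(x):
    return x


def g(x):
    return abs(x - 1.0)          # of bounded variation in [0, 2], not monotonic


def g1(x):
    return 0.5 * (x + g(x))      # (V_0^x(g) + g(x)) / 2, since V_0^x(g) = x here


def g2(x):
    return 0.5 * (x - g(x))      # (V_0^x(g) - g(x)) / 2


points = [2.0 * k / 1000 for k in range(1001)]

total = stieltjes_sum(f, g, points)
part1 = stieltjes_sum(f, g1, points)
part2 = stieltjes_sum(f, g2, points)

print(total, part1 - part2)
```

Both printed numbers should agree with each other and lie close to 1, which is the discrete counterpart of (55).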
Let us indicate the changes that have to be introduced into the
statement of the properties of the Stieltjes integral if $g(x)$ is a function
of bounded variation.

We have

\[ \Big| \sum_{k=1}^{n} f(\xi_k) [ g(x_k) - g(x_{k-1}) ] \Big| \le L \sum_{k=1}^{n} | g(x_k) - g(x_{k-1}) | \le L V_a^b(g), \]

if $| f(x) | \le L$. Passage to the limit gives

\[ \Big| \int_a^b f(x)\, dg(x) \Big| \le L V_a^b(g). \tag{56} \]

This formula replaces (21) of [4]. Let us also recall the formula [2]:

\[ \int_a^b f(x)\, d \sum_{k=1}^{p} a_k g_k(x) = \sum_{k=1}^{p} a_k \int_a^b f(x)\, dg_k(x). \tag{57} \]

If $g_k(x)$ are of bounded variation, a linear combination of them:
$a_1 g_1(x) + \dots + a_p g_p(x)$ is also a function of bounded variation.

We now consider the Stieltjes integral with variable upper limit
when $f(x)$ is continuous and $g(x)$ is of bounded variation:

\[ F(x) = \int_a^x f(t)\, dg(t). \tag{58} \]
We shall prove that $F(x)$ is a function of bounded variation. We
form the sum $t_\delta$ for $[a, b]$:

\[ t_\delta = \sum_{k=1}^{n} \Big| \int_{x_{k-1}}^{x_k} f(t)\, dg(t) \Big|. \]

On applying (56), we get

\[ t_\delta \le L \sum_{k=1}^{n} V_{x_{k-1}}^{x_k}(g) = L V_a^b(g), \]

whence our assertion follows. At the points where $g(x)$ is continuous,
the function $V_a^x(g)$ is also continuous, and an immediate consequence
of the inequality

\[ \Big| \int_x^{x+h} f(t)\, dg(t) \Big| \le L V_x^{x+h}(g) \]

is that $F(x)$ is continuous at these points. Let us also prove the following
statement: if $f(x)$ and $\varphi(x)$ are continuous in $[a, b]$ and $g(x)$ is of
bounded variation, the formula holds:

\[ \int_a^b \varphi(x)\, d \Big[ \int_a^x f(t)\, dg(t) \Big] = \int_a^b \varphi(x) f(x)\, dg(x). \tag{59} \]
It is sufficient to prove this formula for the case of increasing $g(x)$.
We form the sum $\sigma_\delta$ for the integral on the left of (59):

\[ \sigma_\delta = \sum_{k=1}^{n} \varphi(\xi_k) \int_{x_{k-1}}^{x_k} f(t)\, dg(t), \]

or, by the mean value theorem [4]:

\[ \sigma_\delta = \sum_{k=1}^{n} \varphi(\xi_k) f(\xi_k') [ g(x_k) - g(x_{k-1}) ]. \tag{60} \]

The points $\xi_k$ and $\xi_k'$ belong to the same interval, and by arguing
precisely as we did for (9) of [2], it may be seen that the sum on the
right-hand side of (60) yields in the limit the integral on the right-hand
side of (59), and the last formula is thereby proved.


10. Existence of the Stieltjes integral. We have so far considered the Stieltjes
integral of a continuous function $f(x)$ with respect to a function of bounded
variation $g(x)$. It follows from the formula for integration by parts [2] that
the function of bounded variation $g(x)$ is integrable with respect to the
continuous function $f(x)$. We shall give below some simple conditions for the existence
of the Stieltjes integral in other cases. We shall assume that $f(x)$ and $g(x)$ are
bounded in the finite interval $[a, b]$ and that $g(x)$ is a non-decreasing function.
Since we shall make use in future of the Stieltjes integral only in the case when
$f(x)$ is continuous, the results quoted below are given without proof.

I. A necessary condition for the existence of the Stieltjes integral is that $f(x)$
be continuous at every point of discontinuity of $g(x)$, and if this condition is
fulfilled, $f(x)$ is integrable with respect to the jump function $g_d(x)$ and the integral
of $f(x)$ with respect to $g_d(x)$ is given by formula (35).

If this necessary condition for integrability is satisfied, the question of the
integrability of $f(x)$ with respect to $g(x)$ reduces to the question of the
integrability of $f(x)$ with respect to the continuous non-decreasing function $g_c(x)$.
If, for instance, $f(x)$ is a function of bounded variation, we have integrability,
as already indicated. The necessary and sufficient condition for integrability
of $f(x)$ with respect to $g_c(x)$ is as follows.

II. The necessary and sufficient condition for integrability of $f(x)$ with respect
to $g_c(x)$ is that, given any positive $\varepsilon$, the points of discontinuity of $f(x)$ can be
covered by a finite or denumerable set of intervals $[a_k, b_k]$ (which may overlap)
such that

\[ \sum_k [ g_c(b_k) - g_c(a_k) ] < \varepsilon. \tag{61} \]
k 


We now suppose that $g(x)$ is a function of bounded variation. Suppose we
have the canonical form of this function as a difference (45) and some other like
form (47). We form the difference $S_\delta - s_\delta$ for $g_1(x)$ and $g_1^*(x)$:

\[ S_\delta - s_\delta = \sum_{k=1}^{n} (M_k - m_k) [ g_1(x_k) - g_1(x_{k-1}) ], \qquad S_\delta^* - s_\delta^* = \sum_{k=1}^{n} (M_k - m_k) [ g_1^*(x_k) - g_1^*(x_{k-1}) ]. \tag{62} \]

We have seen above that

\[ g_1(x_k) - g_1(x_{k-1}) \le g_1^*(x_k) - g_1^*(x_{k-1}), \]

so that, if $S_\delta^* - s_\delta^* \to 0$ on indefinite subdivision, all the more $S_\delta - s_\delta \to 0$,
i.e. if $f(x)$ is integrable with respect to $g_1^*(x)$, it is also integrable with respect
to $g_1(x)$. Similarly, if $f(x)$ is integrable with respect to $g_2^*(x)$, it is also integrable
with respect to $g_2(x)$. But integrability with respect to $g_k(x)$ ($k = 1, 2$) does
not imply integrability with respect to $g_k^*(x)$. Thus, to investigate the
integrability of $f(x)$ with respect to $g(x)$, we have to investigate the integrability
of $f(x)$ with respect to $g_1(x)$ and $g_2(x)$. If $f(x)$ is integrable with respect
to $g_1(x)$ and $g_2(x)$, the integral of $f(x)$ with respect to $g(x)$ exists, and is given
by (55).

III. The integrability of $f(x)$ with respect to $g_1(x)$ and $g_2(x)$ is equivalent to the
integrability of $f(x)$ with respect to the total variation $V_a^x(g) = g_1(x) + g_2(x)$.

The proof of this proposition is simple, as is that of proposition I. The proof
of proposition II is much more difficult.


11. Passage to the limit in the Stieltjes integral. This and the next few
sections will give some theorems on passage to the limit under the sign of the
Stieltjes integral. We have already had one of these theorems. It was concerned
with the case when the integrable functions tend uniformly to the limit
function $f(x)$. Let $f_n(x)$ be continuous in $[a, b]$, let $f_n(x) \to f(x)$ uniformly in $[a, b]$,
and let $g(x)$ be of bounded variation in $[a, b]$. We have on the basis of [4]
and (55):

\[ \lim_{n \to \infty} \int_a^b f_n(x)\, dg(x) = \int_a^b f(x)\, dg(x). \tag{63} \]

We shall indicate some simple generalizations of this statement, confining
ourselves to an infinite interval.

THEOREM 1. Let $f_n(x)$ be continuous inside $[-\infty, +\infty]$ and bounded by the
same number: $| f_n(x) | \le L$, independently of $n$, let $f_n(x) \to f(x)$ uniformly in
any finite interval and $g(x)$ be of bounded variation in $[-\infty, +\infty]$, and be continuous
at the ends of this interval. Formula (63) now holds for the interval $[-\infty, +\infty]$.

The function $f(x)$ is continuous and bounded inside $[-\infty, +\infty]$, so that it
is integrable with respect to $g(x)$. On recalling that $g(x)$, and consequently its
total variation, is continuous at the ends of the interval, and that
$| f(x) - f_n(x) | \le 2L$, we can say by using (56) that, given any positive $\varepsilon$, there exists a positive
$A$ such that, for any $n$:

\[ \Big| \int_{-\infty}^{-A} [ f(x) - f_n(x) ]\, dg(x) \Big| < \varepsilon; \qquad \Big| \int_{A}^{+\infty} [ f(x) - f_n(x) ]\, dg(x) \Big| < \varepsilon. \]

The passage to the limit $f_n(x) \to f(x)$ is uniform in $[-A, +A]$, so that we
have, by the above-mentioned theorem:

\[ \Big| \int_{-A}^{+A} [ f(x) - f_n(x) ]\, dg(x) \Big| < \varepsilon \]

for all sufficiently large $n$, i.e.

\[ \Big| \int_{-\infty}^{+\infty} [ f(x) - f_n(x) ]\, dg(x) \Big| < 3\varepsilon, \]

whence the theorem follows, since $\varepsilon$ is arbitrary. We shall now prove a similar
theorem for the case when the $f_n(x)$ are unbounded in $[-\infty, +\infty]$, and the
integral over this interval has to be understood as an improper integral.

THEOREM 2. Let $f_n(x)$ be continuous inside $[-\infty, +\infty]$, let the improper
integrals

\[ \int_{-\infty}^{+\infty} f_n(x)\, dg(x) = \lim_{B' \to -\infty,\ B'' \to +\infty} \int_{B'}^{B''} f_n(x)\, dg(x) \tag{64} \]

exist uniformly with respect to $n$, $f_n(x) \to f(x)$ uniformly in any finite interval
and $g(x)$ be a function of bounded variation in any finite interval. Now, the
(improper) integral of $f(x)$ with respect to $g(x)$ over the interval $[-\infty, +\infty]$
exists, and (63) holds.

The function $f(x)$ is continuous in any finite interval and integrable over
such an interval with respect to $g(x)$. We show that it is integrable over an
infinite interval. Let $\varepsilon$ be a given positive number. Since integrals (64) are
uniformly convergent with respect to $n$, a positive $A$ exists, such that, for
any interval $[B', B'']$ lying outside $[-A, +A]$, and any subscript $n$, we have

\[ \Big| \int_{B'}^{B''} f_n(x)\, dg(x) \Big| < \varepsilon. \tag{65} \]

We fix $B'$ and $B''$ in some way so that the interval $[B', B'']$ lies outside
$[-A, +A]$. Now, in view of the uniform convergence $f_n(x) \to f(x)$ in the interval
$[B', B'']$, we have for all sufficiently large $n$:

\[ \Big| \int_{B'}^{B''} [ f(x) - f_n(x) ]\, dg(x) \Big| < \varepsilon. \tag{66} \]

On taking into account the obvious equation:

\[ \int_{B'}^{B''} f(x)\, dg(x) = \int_{B'}^{B''} f_n(x)\, dg(x) + \int_{B'}^{B''} [ f(x) - f_n(x) ]\, dg(x), \]

we obtain by virtue of (65) and (66):

\[ \Big| \int_{B'}^{B''} f(x)\, dg(x) \Big| < 2\varepsilon, \]

whence it follows that the integral of $f(x)$ with respect to $g(x)$ over the interval
$[-\infty, +\infty]$ exists. To prove (63), it is sufficient to notice that the integral
of the difference $f(x) - f_n(x)$ will have a sufficiently small absolute value in
sufficiently remote sub-intervals, whilst in the finite sub-intervals it will be
small for all sufficiently large $n$ by virtue of the uniform convergence of $f_n(x)$
to $f(x)$.

It is worth mentioning that there is no need for uniform convergence of
$f_n(x)$ to $f(x)$ for the validity of (63) in the case of a finite interval. It is sufficient
to require that the continuous functions $f_n(x)$ tend to the continuous function
$f(x)$, whilst remaining bounded, independently of $n$, i.e. there must exist a
positive $L$ such that $| f_n(x) | \le L$ for any $n$ and all $x$ of $[a, b]$. We shall prove
this assertion later [50].


12. Helly’s theorem. We now consider a theorem on passage to the limit in the
case when the integrating function $g(x)$ varies. We must investigate as a
preliminary the convergence of a function of bounded variation to a limit
function. Let $g_n(x)$ be a sequence of functions of bounded variation in the interval
$[a, b]$, the variation of all these functions being bounded by the same number
$L$, independent of $n$:

\[ V_a^b(g_n) \le L. \tag{67} \]

Suppose that, at every point of $[a, b]$, $g_n(x)$ tend to the limit function $g(x)$
which has finite values. It may easily be seen that $g(x)$ is also a function of
bounded variation. In fact, we have the inequality for the sums $t_\delta$ for the
$g_n(x)$:

\[ t_\delta^{(n)} = \sum_{k=1}^{m} | g_n(x_k) - g_n(x_{k-1}) | \le L, \]

whence we obtain, on passing to the limit, the same inequality for the sum $t_\delta$
for $g(x)$:

\[ t_\delta = \sum_{k=1}^{m} | g(x_k) - g(x_{k-1}) | \le L, \]

whence it follows that the variation of $g(x)$ is also not greater than $L$. If the
$g_n(x)$, instead of tending to $g(x)$ at every point of $[a, b]$, only tend to $g(x)$ in
a set $\mathcal{E}$ of points $x_k$ ($k = 1, 2, 3, \dots$) dense in $[a, b]$, it is no longer possible
to assert that $g(x)$ is a function of bounded variation. We shall assume in future
that $g(x)$ is in fact of bounded variation in this case. It may be mentioned that
a set $\mathcal{E}$ of points $x_k$ is said to be dense in $[a, b]$ if any part of $[a, b]$ contains
an infinite set of points of $\mathcal{E}$. Let $g_n(x)$ be a sequence of increasing functions,
which tends to a finite-valued limit function $g(x)$ at every point of $[a, b]$.
The limit function here will also be increasing, and hence will be a function of
bounded variation. Let us prove the following theorem:
THEOREM 1. If $g_n(x)$ are increasing functions in the interval $[a, b]$, and tend
to a function $g(x)$ on a set of points $\mathcal{E}$ dense in $[a, b]$, the convergence holds at every
interior point of $[a, b]$ where $g(x)$ is continuous.

Let $g(x)$ be continuous at the point $x_0$, and let $x'$, $x''$ be points of the set $\mathcal{E}$
to the left and right of $x_0$, i.e. $x' < x_0 < x''$. We have $g_n(x') \le g_n(x_0) \le g_n(x'')$,
so that

\[ g(x_0) - g_n(x'') \le g(x_0) - g_n(x_0) \le g(x_0) - g_n(x'). \]

We can rewrite this inequality as follows:

\[ [ g(x_0) - g(x'') ] + [ g(x'') - g_n(x'') ] \le g(x_0) - g_n(x_0) \le [ g(x_0) - g(x') ] + [ g(x') - g_n(x') ]. \tag{68} \]

Let $\varepsilon$ be a given positive number. The points $x'$ and $x''$ of $\mathcal{E}$, which is
everywhere dense in $[a, b]$, can be taken so close to $x_0$ that $| g(x_0) - g(x'') | < \varepsilon$
and $| g(x_0) - g(x') | < \varepsilon$, since $x_0$ is a point where $g(x)$ is continuous. Having
thus fixed $x'$ and $x''$, we have the inequalities, for all sufficiently large $n$:
$| g(x') - g_n(x') | < \varepsilon$ and $| g(x'') - g_n(x'') | < \varepsilon$, since the $g_n(x)$ tend to $g(x)$
at $x'$ and $x''$. These last inequalities, in conjunction with (68), at once give us

\[ -2\varepsilon < g(x_0) - g_n(x_0) < +2\varepsilon, \]

whence, since $\varepsilon$ is arbitrary, it follows that $g_n(x_0) \to g(x_0)$. We shall now state
a fundamental theorem on passage to the limit.

THEOREM 2 (Helly). Let $f(x)$ be continuous in $[a, b]$, $g_n(x)$ be of bounded
variation, the variations $V_a^b(g_n)$ being not greater than some number $L$ independent
of $n$, and $g_n(x) \to g(x)$ at all points of $[a, b]$. The formula now holds:

\[ \lim_{n \to \infty} \int_a^b f(x)\, dg_n(x) = \int_a^b f(x)\, dg(x). \tag{69} \]

As indicated above, $g(x)$ is a function of bounded variation, so that $f(x)$ is
integrable with respect to $g(x)$. We divide the interval into sub-intervals:
$a = x_0 < x_1 < \dots < x_{m-1} < x_m = b$ and write the obvious formula:

\[ \int_a^b f(x)\, dg(x) = \sum_{k=1}^{m} \int_{x_{k-1}}^{x_k} f(x)\, dg(x) = \sum_{k=1}^{m} \int_{x_{k-1}}^{x_k} [ f(x) - f(x_k) ]\, dg(x) + \sum_{k=1}^{m} f(x_k) \int_{x_{k-1}}^{x_k} dg(x), \]

i.e.

\[ \int_a^b f(x)\, dg(x) = \sum_{k=1}^{m} \int_{x_{k-1}}^{x_k} [ f(x) - f(x_k) ]\, dg(x) + \sum_{k=1}^{m} f(x_k) [ g(x_k) - g(x_{k-1}) ]. \tag{70} \]

Given any positive $\varepsilon$, we can fix such a fine subdivision of $[a, b]$ that
$| f(x) - f(x_k) | < \varepsilon$ for any $x$ of $[x_{k-1}, x_k]$ and any $k$. This follows at once from the uniform continuity of
$f(x)$ in $[a, b]$. We now obtain:

\[ \Big| \int_{x_{k-1}}^{x_k} [ f(x) - f(x_k) ]\, dg(x) \Big| \le \varepsilon V_{x_{k-1}}^{x_k}(g), \]

whence

\[ \Big| \sum_{k=1}^{m} \int_{x_{k-1}}^{x_k} [ f(x) - f(x_k) ]\, dg(x) \Big| \le \varepsilon \sum_{k=1}^{m} V_{x_{k-1}}^{x_k}(g), \]

i.e.

\[ \Big| \sum_{k=1}^{m} \int_{x_{k-1}}^{x_k} [ f(x) - f(x_k) ]\, dg(x) \Big| \le \varepsilon V_a^b(g) \le \varepsilon L. \]

With the subdivision fixed above, we can write (70) in the form

\[ \int_a^b f(x)\, dg(x) = \theta \varepsilon L + \sum_{k=1}^{m} f(x_k) [ g(x_k) - g(x_{k-1}) ], \]

where $| \theta | \le 1$. Similar arguments lead us to the formula

\[ \int_a^b f(x)\, dg_n(x) = \theta_n \varepsilon L + \sum_{k=1}^{m} f(x_k) [ g_n(x_k) - g_n(x_{k-1}) ], \]

where $| \theta_n | \le 1$. Subtraction term by term gives us

\[ \int_a^b f(x)\, dg(x) - \int_a^b f(x)\, dg_n(x) = (\theta - \theta_n) \varepsilon L + \sum_{k=1}^{m} f(x_k) \{ [ g(x_k) - g_n(x_k) ] - [ g(x_{k-1}) - g_n(x_{k-1}) ] \}. \]

The points $x_k$ are fixed, and, since the $g_n(x)$ converge to $g(x)$ at these points,
the sum appearing in the last formula has an absolute value less than $\varepsilon$ for all
sufficiently large $n$. The last formula thus leads us to the inequality, for these
$n$:

\[ \Big| \int_a^b f(x)\, dg(x) - \int_a^b f(x)\, dg_n(x) \Big| < \varepsilon (2L + 1), \]

whence (69) follows, since $\varepsilon$ is arbitrary.
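A numerical illustration of (69) may help fix ideas. The sketch below is hypothetical throughout: the helper `stieltjes_sum` with midpoint choice of $\xi_k$, the family $g_n(x) = x + \sin(nx)/n$ on $[0, 1]$, and $f(x) = x^2$ are assumptions chosen for illustration, and Stieltjes sums on a fine grid are used in place of the integrals. Each $g_n$ is increasing with $g_n(1) - g_n(0) \le 2$, so the variations are bounded independently of $n$, and $g_n(x) \to x$ at every point.

```python
import math


def stieltjes_sum(f, g, points):
    """Sum of f(xi_k) [g(x_k) - g(x_{k-1})], with xi_k taken as the midpoint."""
    return sum(f(0.5 * (x0 + x1)) * (g(x1) - g(x0))
               for x0, x1 in zip(points, points[1:]))


def f(x):
    return x * x


def make_g(n):
    """g_n(x) = x + sin(n x) / n; increasing, with variation over [0, 1] at most 2."""
    return lambda x: x + math.sin(n * x) / n


def g_limit(x):
    return x                     # pointwise limit of g_n(x) as n grows


points = [k / 20000 for k in range(20001)]

limit_value = stieltjes_sum(f, g_limit, points)
gaps = {n: abs(stieltjes_sum(f, make_g(n), points) - limit_value)
        for n in (10, 50, 200)}

print(limit_value, gaps)
```

The value printed first should be close to $1/3$, and the gaps for the larger values of $n$ should be small in comparison with the gap for $n = 10$. The derivatives $g_n'(x) = 1 + \cos nx$ do not converge, which shows that the theorem needs only pointwise convergence of the $g_n$ together with the bound on their variations.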

Note. Suppose that the $g_n(x)$ tend to $g(x)$ only on a set of points dense
in $[a, b]$, instead of at every point, both ends of the interval being included in
the set; the limit function $g(x)$ is assumed to be of bounded variation. Now,
if we take points of the set as points $x_k$ of subdivision of the interval, the above
proof remains in force, and we arrive at (69) as before. We shall now mention
some generalizations of the theorem, similar to the generalizations of [11].

THEOREM 3. Let $f(x)$ be continuous inside the interval $[-\infty, +\infty]$ and bounded,
let $g_n(x)$ and $g(x)$ be increasing functions in this interval and continuous at the
ends, and let $g_n(x) \to g(x)$ on a set of points dense in the interval, and in particular
at both ends. We now have the formula:

\[ \lim_{n \to \infty} \int_{-\infty}^{+\infty} f(x)\, dg_n(x) = \int_{-\infty}^{+\infty} f(x)\, dg(x). \tag{71} \]
It is worth remarking that, in the present case, the total variation of $g_n(x)$
is given by the difference $g_n(+\infty) - g_n(-\infty)$, and it follows at once from the
conditions of the theorem that all these total variations are not greater than
some $L$ independent of $n$. The integrals of $f(x)$ with respect to the $g_n(x)$ and $g(x)$
exist. Let us find an inequality for the difference between these integrals, by
splitting the interval of integration into three parts: $[-\infty, a]$, $[a, b]$, $[b, +\infty]$,
where the points $a$ and $b$ belong to the set in which $g_n(x) \to g(x)$:

\[ \Big| \int_{-\infty}^{+\infty} f\, dg - \int_{-\infty}^{+\infty} f\, dg_n \Big| \le \Big| \int_{-\infty}^{a} f\, dg - \int_{-\infty}^{a} f\, dg_n \Big| + \Big| \int_{a}^{b} f\, dg - \int_{a}^{b} f\, dg_n \Big| + \Big| \int_{b}^{+\infty} f\, dg - \int_{b}^{+\infty} f\, dg_n \Big|. \tag{72} \]

The function $f(x)$ is bounded, i.e. $| f(x) | \le L$. We can write for the first of
the differences:

\[ \Big| \int_{-\infty}^{a} f(x)\, dg(x) - \int_{-\infty}^{a} f(x)\, dg_n(x) \Big| \le L \{ [ g(a) - g(-\infty) ] + [ g_n(a) - g_n(-\infty) ] \}, \]

which can be put in the form

\[ \Big| \int_{-\infty}^{a} f\, dg - \int_{-\infty}^{a} f\, dg_n \Big| \le L \{ 2 [ g(a) - g(-\infty) ] + [ g_n(a) - g(a) ] + [ g(-\infty) - g_n(-\infty) ] \}. \]

Since $g(x)$ is continuous at $x = -\infty$, we can fix an $a$ so close to $(-\infty)$ that
the positive difference $g(a) - g(-\infty)$ is less than any previously assigned positive
number. Having fixed this $a$, we further remark that the differences
$g_n(a) - g(a)$ and $g(-\infty) - g_n(-\infty)$ are also as small as desired in absolute
value for all sufficiently large $n$. A similar treatment can be given of the third
term on the right-hand side of (72). Hence, given any positive $\varepsilon$, we can fix
$a$ and $b$ of the above-mentioned set, everywhere dense in $[-\infty, +\infty]$, such
that the first and third terms on the right of (72) are less than $\varepsilon$ for all sufficiently
large $n$. The theorem proved above is applicable to the finite interval $[a, b]$,
i.e. the second term on the right of (72) is also less than $\varepsilon$ for all sufficiently
large $n$. Thus the left-hand side of (72) is less than $3\varepsilon$ for all sufficiently large
$n$, whence (71) follows, in view of the arbitrariness of $\varepsilon$.

We now turn to a second generalization of Theorem 2, which is concerned 
with improper Stieltjes integrals. 

THEOREM 4. Let $f(x)$ be continuous inside the interval $[-\infty, +\infty]$, let $g_n(x)$
and $g(x)$ be of bounded variation in this interval, the variations of the $g_n(x)$ being
not greater than some number independent of $n$. Further, let $g_n(x) \to g(x)$ on a
set dense in $[-\infty, +\infty]$, and let the improper integrals

\[ \int_{-\infty}^{+\infty} f(x)\, dg_n(x) = \lim_{a \to -\infty,\ b \to +\infty} \int_a^b f(x)\, dg_n(x) \]

be uniformly convergent with respect to $n$. The function $f(x)$ is now integrable with
respect to $g(x)$ over $[-\infty, +\infty]$, and (71) holds.
Given any positive $\varepsilon$, there exists a positive $A$ such that

\[ \Big| \int_{B'}^{B''} f(x)\, dg_n(x) \Big| < \varepsilon \tag{73} \]

for any interval $[B', B'']$ lying outside $[-A, +A]$.
We fix any such interval $[B', B'']$ and write the obvious inequality

\[ \Big| \int_{B'}^{B''} f(x)\, dg(x) \Big| \le \Big| \int_{B'}^{B''} f(x)\, dg_n(x) \Big| + \Big| \int_{B'}^{B''} f(x)\, dg(x) - \int_{B'}^{B''} f(x)\, dg_n(x) \Big|. \]

In view of the remark on Theorem 2, an $n$ can be chosen so large that the
second term on the right-hand side is less than $\varepsilon$. It now follows at once from
(73) that

\[ \Big| \int_{B'}^{B''} f(x)\, dg(x) \Big| < 2\varepsilon. \]

Since $\varepsilon$ is arbitrary, this inequality shows that the integral of $f(x)$ with
respect to $g(x)$ over $[-\infty, +\infty]$ exists. Formula (71) is proved by using precisely
the same method as above, of dividing the interval $[-\infty, +\infty]$ into three parts.


13. Selection principle. We have already investigated a selection principle 
for sets of continuous functions [IV; 15 and 16]. We now prove a theorem which 
gives us a selection principle for functions of bounded variation. 

THEOREM (Helly). Let $\mathcal{E}$ be a set of functions of bounded variation in the
interval $[a, b]$ (finite or infinite), where a positive number $L$ exists such that,
for all functions $g(x)$ belonging to $\mathcal{E}$, we have the inequalities:

\[ | g(x) | \le L; \qquad V_a^b(g) \le L, \tag{74} \]

i.e. all the $g(x)$ are bounded in absolute value and their variations over $[a, b]$ are
also bounded by some number. Now, from any infinite sequence $g_n(x)$ of functions
belonging to the set $\mathcal{E}$, we can select a subsequence $g_{n_k}(x)$ which tends to some
function of bounded variation $g(x)$ at every point of $[a, b]$.

We only need to prove the possibility of selecting a subsequence $g_{n_k}(x)$ which
tends to a limit at every point of $[a, b]$. After this, it will follow at once from the
conditions of the theorem, in view of what was said in [12], that the limit
function $g(x)$ is of bounded variation. Let us prove a preliminary lemma.

LEMMA. If there exists a sequence of functions $h_n(x)$, increasing in $[a, b]$
and bounded by the same number $L$, we can extract from it a subsequence of functions
having a limit at every point of $[a, b]$.

We form the denumerable set of points $x_k$ ($k = 1, 2, \dots$), contained in $[a, b]$
and consisting of the left-hand end $x = a$ and of all the points $x$ that have rational
abscissae. This set of points is dense in $[a, b]$, and we can extract a subsequence
$h_{n_k}(x)$ which is convergent at all the points $x_k$. We thus obtain a limit function
$h(x)$, as yet defined only at the points $a$ and $x_k$. We extend it as follows to the
remaining points of $[a, b]$. If $x$ is a point of $[a, b]$ not belonging to the
above-mentioned set of points $x_k$, we shall take $h(x)$ to be equal to the strict upper
bound of the values of $h(x_k)$ for all the $x_k$ lying to the left of $x$, i.e. we put

\[ h(x) = \sup_{x_k < x} h(x_k). \]

The $h(x)$ thus constructed will obviously be increasing and bounded in $[a, b]$.
It can only have a finite or denumerable set of points of discontinuity: $\xi_1, \xi_2, \dots$
The sequence $h_{n_k}(x)$ is convergent to $h(x)$ at all the points $x_k$ of the set
everywhere dense in $[a, b]$. By Theorem 1 of [12], we also have convergence at all
the points inside $[a, b]$ where $h(x)$ is continuous. The convergence of $h_{n_k}(x)$
to $h(x)$ can therefore only break down at the points of discontinuity $\xi_k$ of
$h(x)$ and at the right-hand end of the interval. On again applying the selection
principle to the sequence $h_{n_k}(x)$, we can arrange to have convergence at these
points also [IV; 15], and the lemma is therefore proved.

The fundamental theorem is now fairly easy to prove. Every function of
bounded variation $g(x)$ belonging to the set $\mathcal{E}$ can be written as a difference
between increasing functions:

\[ g(x) = \tfrac{1}{2} [ V_a^x(g) + g(x) ] - \tfrac{1}{2} [ V_a^x(g) - g(x) ], \tag{75} \]

where, by (74), both these increasing functions have absolute values not exceeding
$L$. We can say, on applying the lemma, that it is possible to extract from
the sequence $g_n(x)$ a sequence for which the minuend on the right-hand side of
(75) tends to a limit function at all points of $[a, b]$. On again applying the lemma,
we can say that it is possible to extract from the sequence obtained a subsequence
for which the subtrahend on the right-hand side of (75) also tends to a limit
function at every point of $[a, b]$. Thus we obtain a subsequence $g_{n_k}(x)$ which
tends to a limit function at every point of $[a, b]$, and the theorem is proved.


14. Space of continuous functions. We consider the set of all functions
taking real values and continuous in a given finite interval $[a, b]$, and describe
this set as a space $C$. An element (or vector) of this space is any function
continuous in $[a, b]$. Different continuous functions represent different elements of
$C$. The function which is identically zero in $[a, b]$ is called the zero element
of $C$. If we form any finite linear combination of real functions continuous in
$[a, b]$, with real coefficients: $c_1 f_1(x) + c_2 f_2(x) + \dots + c_m f_m(x)$, we obtain a
real function continuous in $[a, b]$, i.e. elements of $C$ can be multiplied by real
numbers and added to give a further element of $C$. This operation is subject
to the ordinary laws of elementary algebra, e.g.

\[ f_1(x) + f_2(x) = f_2(x) + f_1(x); \qquad c [ f_1(x) + f_2(x) ] = c f_1(x) + c f_2(x); \]
\[ (c_1 + c_2) f(x) = c_1 f(x) + c_2 f(x); \qquad c_1 ( c_2 f(x) ) = (c_1 c_2) f(x). \]

The concept of the norm of an element may be introduced into space $C$;
in other words, we introduce the idea of the length of a vector in $C$. The norm
of an element $f(x)$ is defined as the maximum value taken by $| f(x) |$ in $[a, b]$.
The norm of the zero element is zero, and is positive for any other element.

We shall use the notation $\| f \|$ for the norm of the element $f(x)$. Finally, the
concept of convergence can be introduced into space $C$. We shall say that a
sequence of elements $f_n(x)$ of $C$ is convergent to the element $f(x)$ of $C$ if
$\| f(x) - f_n(x) \| \to 0$. This last is equivalent to the fact that the maximum value of
$| f(x) - f_n(x) |$ tends to zero in the interval $[a, b]$, which is in turn obviously
equivalent to the fact that $f_n(x) \to f(x)$ uniformly in $[a, b]$.

We next introduce the concepts of functional and operator in space $C$. A
functional in $C$ is any definite law, in accordance with which any element $f(x)$ of $C$
is associated with a definite real number. The following notations are generally
used for functionals: $\Phi[f(x)]$, $\Psi[f(x)]$, etc. The concept of functional is a
modification of the ordinary concept of function. In the case of a functional the role
of argument is played by elements of $C$, whilst the value of the functional is
a real number. A functional is said to be distributive if, given any finite linear
combination of elements: $c_1 f_1(x) + c_2 f_2(x) + \dots + c_m f_m(x)$, it satisfies the
equation

\[ \Phi[ c_1 f_1(x) + c_2 f_2(x) + \dots + c_m f_m(x) ] = c_1 \Phi[f_1(x)] + c_2 \Phi[f_2(x)] + \dots + c_m \Phi[f_m(x)]. \tag{76} \]

The functional $\Phi[f(x)]$ is said to be bounded if there exists a positive number
$N$ such that

\[ | \Phi[f(x)] | \le N \| f(x) \| \tag{77} \]

for any element $f(x)$ of $C$.

The left-hand side of this inequality is the absolute value of a real number
$\Phi[f(x)]$, which expresses the value of the functional in $C$ for the element $f(x)$,
whilst the right-hand side is the product of the positive number $N$ with the norm
of element $f(x)$, i.e. the product of $N$ with the greatest value taken by $| f(x) |$
in the interval $[a, b]$. Functionals that are distributive and bounded are described
as linear. The concept of continuity of a functional may also be introduced:
the functional $\Phi[f(x)]$ is said to be continuous if the following condition is
satisfied: if $f_n(x) \to f(x)$ uniformly in $[a, b]$, then $\Phi[f_n(x)] \to \Phi[f(x)]$. A linear
functional may easily be seen to be continuous. For we can write, using (76)
and (77):

\[ | \Phi[f(x)] - \Phi[f_n(x)] | = | \Phi[ f(x) - f_n(x) ] | \le N \| f(x) - f_n(x) \|. \]

In view of the uniform convergence of $f_n(x)$ to $f(x)$, we have
$\| f(x) - f_n(x) \| \to 0$, so that $\Phi[f(x)] - \Phi[f_n(x)] \to 0$, i.e. in fact $\Phi[f_n(x)] \to \Phi[f(x)]$.
In the definition of linearity, we could have required that the functional be
additive and continuous, then proved that it is bounded, i.e. boundedness and
continuity are equivalent in the case of a distributive functional. We shall not
dwell on the proof of this, since it presents no difficulty.

An example of a functional may be mentioned. Let $x_0$ be any fixed point
of the interval [a, b]. The values $f(x_0)$ taken at this point by the functions
f(x) of C form a linear functional in C. The definite integral

$$\int_a^b f(x) \, dx$$


is also an example of a linear functional. Let g(x) be a function of bounded
variation in [a, b]. Given any element f(x) of C, we can form the Stieltjes integral

$$\Phi[f(x)] = \int_a^b f(x) \, dg(x). \tag{78}$$


It represents a linear functional $\Phi[f(x)]$. It is distributive because the integral
is distributive with respect to the functions f(x), and it is bounded by virtue of the
inequality

$$\left| \int_a^b f(x) \, dg(x) \right| \le L \, V_a^b(g),$$

where L is the maximum of | f(x) | in [a, b]. Thus the role of the number N
appearing in (77) can be played for the functional (78) by the total variation
$V_a^b(g)$.
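
The bound just stated can be observed on Riemann-Stieltjes sums. The sketch below is illustrative only: the functions `g` and `f`, the uniform partition and the helper names are assumptions of this example, and the sums merely approximate the integral (78).

```python
import math

def partition(a, b, n):
    return [a + (b - a) * k / n for k in range(n + 1)]

def stieltjes_sum(f, g, a, b, n=1000):
    """Riemann-Stieltjes sum of f with respect to g; xi_k is the right end of each sub-interval."""
    xs = partition(a, b, n)
    return sum(f(xs[k]) * (g(xs[k]) - g(xs[k - 1])) for k in range(1, n + 1))

def variation_on_grid(g, a, b, n=1000):
    """Sum of |g(x_k) - g(x_(k-1))| on the grid; it never exceeds the total variation of g."""
    xs = partition(a, b, n)
    return sum(abs(g(xs[k]) - g(xs[k - 1])) for k in range(1, n + 1))

def g(x):
    # Hypothetical g of bounded variation: slope 1 plus a unit jump at x = 0.5.
    return x + (1.0 if x >= 0.5 else 0.0)

def f(x):
    return math.cos(3.0 * x)

phi = stieltjes_sum(f, g, 0.0, 1.0)   # approximates int_0^1 f(x) dx + f(0.5) * 1
L = 1.0                               # max |cos(3x)| on [0, 1], attained at x = 0
V = variation_on_grid(g, 0.0, 1.0)    # equals g(1) - g(0) = 2 for this monotone g
```

For every such sum we have `abs(phi) <= L * V`, which is the discrete counterpart of the inequality above.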

Let us return to (77). If this holds for some choice of the positive number N,
it holds all the more for larger values of N. Let us show that there is a least 
value of N for which (77) holds. 

We remark first of all that, if || f || = 0, f(x) is identically zero in [a, b],
i.e. f(x) is the zero element of C if || f || = 0. For any other element, || f || > 0,
but the zero element of C can be written as the product $f_0(x) = 0 \cdot f(x)$, where
f(x) is any continuous function. It follows from (76) that $\Phi(f_0) = 0 \cdot \Phi(f) = 0$,
i.e. any distributive functional vanishes on the zero element of C. For the zero
element, therefore, inequality (77) has the form $0 \le N \cdot 0$, and it holds for
any choice of N, i.e. we need only discuss (77) when || f || > 0. On taking (76)
into account, we can rewrite (77) as

$$\left| \Phi\!\left[ \frac{f(x)}{\| f \|} \right] \right| \le N. \tag{79}$$

But $\varphi(x) = f(x) / \| f \|$ is an element of C with unit norm.
Let $n_\Phi$ denote the strict upper bound of the non-negative numbers $| \Phi(\varphi) |$,
where $\varphi(x)$ is any element of C with unit norm:

$$n_\Phi = \sup_{\| \varphi \| = 1} | \Phi(\varphi) |. \tag{80}$$

It follows at once from (79) that $n_\Phi$ is in fact the least possible value of N
in (77). This number $n_\Phi$ is called the norm of the functional $\Phi(f)$. It is often also
written as $\| \Phi \|$.
We have

$$| \Phi(f) | \le n_\Phi \, \| f \|, \tag{81}$$


but (77) can no longer hold for all f(x) of C if we take $N < n_\Phi$. We remark further
that $n_\Phi \ge 0$, and if $n_\Phi = 0$, it follows from (81) that $\Phi(f) = 0$ for any element
f(x) of C, i.e. the functional $\Phi(f)$ associates the number zero with any element
f(x). It follows from what has been said that the norm of a functional, given
by (78), does not exceed $V_a^b(g)$. It is worth recalling that we considered in volume
IV the space fF’ of continuous functions, with a different definition of the norm 
of an element [IV; 35]. 
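
As a worked instance of (80), offered as a sketch rather than as part of the original argument, take the functional that associates with f(x) its value at a fixed interior point $x_0$ of [a, b], i.e. $\Phi(f) = f(x_0)$. For every f(x) of C,

$$| \Phi(f) | = | f(x_0) | \le \max_{a \le x \le b} | f(x) | = \| f \|, \qquad \text{so that } n_\Phi \le 1,$$

whilst for $f(x) \equiv 1$ we have $\Phi(f) = 1 = \| f \|$, so that $n_\Phi = 1$. This functional has the form (78) with a step function g(x) equal to 0 for $x < x_0$ and to 1 for $x \ge x_0$, whose total variation is 1; in this instance the norm of the functional is equal to the total variation of g(x).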
15. General form of the functional in space C. We shall next prove Riesz’s
important theorem, that any linear functional $\Phi(f)$ in C can be written in the
form (78), where g(x) is a function of bounded variation defining this functional.
But, as we have seen above, every integral (78), where g(x) is a fixed function
of bounded variation, is expressible as a linear functional in C. We can therefore
assert, after proving Riesz’s theorem, that integral (78), where g(x) is a function
of bounded variation, represents the general form of linear functional in C.

THEOREM (F. Riesz). Every linear functional in space C can be written in
the form (78), where g(x) is a function of bounded variation.

We shall use polynomials of a special type in the proof of this theorem;
these polynomials were first introduced by Bernshtein, and have already been
mentioned. Let us recall the construction and basic property of the polynomials.
Let f(x) be a continuous function in the interval [0,1]. The Bernshtein
polynomial corresponding to this function has the form

$$P_n(x) = \sum_{m=0}^{n} f\!\left( \frac{m}{n} \right) C_n^m x^m (1 - x)^{n-m} \qquad \left( C_n^m = \frac{n(n-1) \dots (n-m+1)}{m!} \right). \tag{82}$$
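
Formula (82) is easy to evaluate directly. The following sketch is illustrative: the test function |x - 1/2|, the grid used to estimate the maximum deviation and the helper names are assumptions of this example.

```python
from math import comb

def bernstein(f, n, x):
    """Value at x of the polynomial (82) built from f on [0, 1]."""
    return sum(f(m / n) * comb(n, m) * x**m * (1 - x)**(n - m) for m in range(n + 1))

def max_error(f, n, samples=201):
    """Grid estimate of max |f(x) - P_n(x)| on [0, 1]."""
    grid = [i / (samples - 1) for i in range(samples)]
    return max(abs(f(x) - bernstein(f, n, x)) for x in grid)

def kink(x):
    # Continuous on [0, 1] but not differentiable at x = 1/2.
    return abs(x - 0.5)

errors = [max_error(kink, n) for n in (4, 16, 64)]
```

The entries of `errors` decrease as n grows, in agreement with the uniform convergence of the polynomials to the function; for this particular function the decrease is slow.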


As we proved earlier [II, 154], on indefinite increase of n the sequence of
polynomials $P_n(x)$ tends uniformly in the interval [0,1] to the function f(x).
To prove our theorem, we transform the interval [a, b] into the interval [0,1]
with the aid of the linear change of the independent variable: y = (x - a)/(b - a).
The space of functions continuous in [a, b] now becomes the space
of functions continuous in [0,1], and we shall assume in the proof that the basic
interval [a, b] is already [0,1]. Let $\Phi[f(x)]$ be a linear functional in space C. We have
to show that it can be written in the form (78), where g(x) is a function of bounded
variation in [0,1]. We have the obvious equation

$$\sum_{m=0}^{n} C_n^m x^m (1 - x)^{n-m} = 1,$$

and all the terms of the sum written are non-negative if x belongs to [0,1].
Hence it follows that, if $\varepsilon_m$ is a number equal to +1 or -1, the inequality
holds:

$$\left| \sum_{m=0}^{n} \varepsilon_m C_n^m x^m (1 - x)^{n-m} \right| \le 1 \qquad (0 \le x \le 1). \tag{83}$$


On applying the functional $\Phi[f(x)]$ to the polynomial on the left-hand side
of (83), we obtain, by (77) and (83):

$$\left| \sum_{m=0}^{n} \varepsilon_m \Phi[C_n^m x^m (1 - x)^{n-m}] \right| \le n_\Phi. \tag{84}$$

We now choose the signs of the $\varepsilon_m$ so that the products
$\varepsilon_m \Phi[C_n^m x^m (1 - x)^{n-m}]$ are non-negative for any m. With this choice
of $\varepsilon_m$, (84) can be written as

$$\sum_{m=0}^{n} \left| \Phi[C_n^m x^m (1 - x)^{n-m}] \right| \le n_\Phi. \tag{85}$$
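We subdivide [0,1] into n equal parts and define a function $g_n(x)$ such that
it remains constant in each sub-interval, i.e. we define the function as follows:

$$
\left.
\begin{aligned}
g_n(0) &= 0, \\
g_n(x) &= \Phi[C_n^0 x^0 (1 - x)^{n-0}] && \text{for } 0 < x < \tfrac{1}{n}, \\
g_n(x) &= \Phi[C_n^0 x^0 (1 - x)^{n-0}] + \Phi[C_n^1 x (1 - x)^{n-1}] && \text{for } \tfrac{1}{n} \le x < \tfrac{2}{n}, \\
g_n(x) &= \sum_{m=0}^{2} \Phi[C_n^m x^m (1 - x)^{n-m}] && \text{for } \tfrac{2}{n} \le x < \tfrac{3}{n}, \\
&\;\;\vdots \\
g_n(x) &= \sum_{m=0}^{n-1} \Phi[C_n^m x^m (1 - x)^{n-m}] && \text{for } \tfrac{n-1}{n} \le x < 1, \\
g_n(1) &= \sum_{m=0}^{n} \Phi[C_n^m x^m (1 - x)^{n-m}].
\end{aligned}
\right\} \tag{86}
$$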


The total variation of $g_n(x)$ is evidently equal to the sum of the absolute
values of the jumps of $g_n(x)$ at the points of subdivision and at the ends of
the interval. By (85), we have $V_0^1(g_n) \le n_\Phi$. Similarly, it follows at once from
(85) and the definition of the function $g_n(x)$ that $| g_n(x) | \le n_\Phi$. Theorem 1
of [13] can therefore be applied to the sequence of $g_n(x)$, and we can say that
a sequence of increasing positive integers $n_k$ exists such that $g_{n_k}(x)$ tends to
a function g(x) of bounded variation at every point of [0,1]. We next show
that this function g(x) in fact appears in the right-hand side of (78). We form
the Stieltjes integral of f(x) with respect to $g_n(x)$. It is equal to the sum of the
products of the values of f(x) at the points of discontinuity of $g_n(x)$ with the
jumps at these points, i.e. [6]:

$$\int_0^1 f(x) \, dg_n(x) = \sum_{m=0}^{n} f\!\left( \frac{m}{n} \right) \Phi[C_n^m x^m (1 - x)^{n-m}].$$


By (76) and (82), the right-hand side is the value of the functional $\Phi[f(x)]$
for $f(x) = P_n(x)$, i.e.

$$\int_0^1 f(x) \, dg_n(x) = \Phi[P_n(x)].$$

We apply this formula for $n = n_k$:

$$\int_0^1 f(x) \, dg_{n_k}(x) = \Phi[P_{n_k}(x)]. \tag{87}$$


When $n_k$ increases indefinitely, $P_{n_k}(x) \to f(x)$ uniformly in [0,1] and, since
the functional $\Phi[f(x)]$ is continuous, (87) gives

$$\Phi[f(x)] = \lim_{n_k \to \infty} \int_0^1 f(x) \, dg_{n_k}(x).$$

We can apply Theorem 1 of [12] to the right-hand side, and we thus arrive
at the formula

$$\Phi[f(x)] = \int_0^1 f(x) \, dg(x). \tag{88}$$
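
The construction (86) can be followed through explicitly for one concrete functional. The sketch below assumes, purely for illustration, the functional defined by the ordinary integral of f(x) over [0, 1]; the helper names are hypothetical. For this functional every jump of the step function equals 1/(n + 1), and the limit function is g(x) = x.

```python
from fractions import Fraction
from math import comb, factorial, floor

def phi_basis(n, m):
    """Phi[C(n, m) x^m (1 - x)^(n - m)] for the example functional Phi[f] = int_0^1 f(x) dx.

    Uses the Beta integral int_0^1 x^m (1 - x)^(n - m) dx = m! (n - m)! / (n + 1)!.
    """
    return Fraction(comb(n, m) * factorial(m) * factorial(n - m), factorial(n + 1))

def g_n(n, x):
    """Step function of (86): zero at x = 0, then the jumps at the points m / n accumulated up to x."""
    if x == 0:
        return Fraction(0)
    return sum(phi_basis(n, m) for m in range(floor(n * x) + 1))

def integral_wrt_g_n(f, n):
    """Stieltjes integral of f with respect to g_n: values of f at m / n times the jumps there."""
    return sum(f(Fraction(m, n)) * phi_basis(n, m) for m in range(n + 1))

def square(x):
    return x * x

approximations = [integral_wrt_g_n(square, n) for n in (10, 100, 1000)]  # tends to 1/3
sum_of_jumps = sum(abs(phi_basis(10, m)) for m in range(11))             # left-hand side of (85), n = 10
```

Here `sum_of_jumps` equals 1, which is the norm of this functional, and the values in `approximations` approach 1/3, the value of the functional on the function x squared.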


We show that $n_\Phi = V_0^1(g)$. It follows at once from (88) that, as we have seen
in [14], $n_\Phi \le V_0^1(g)$. On the other hand, it follows from the inequality
$V_0^1(g_{n_k}) \le n_\Phi$ given above, that $V_0^1(g) \le n_\Phi$ also. These two inequalities in fact lead to the
equation $n_\Phi = V_0^1(g)$. Let us consider whether form (88) for the functional
$\Phi(f)$ is unique. Let h(x) be a function of bounded variation, the values of which
differ from the values of g(x) on some set $\mathcal{E}$ of points of [0, 1], where $\mathcal{E}$ is a
finite or denumerable set. It may readily be seen that

$$\int_0^1 f(x) \, dh(x) \tag{89}$$


is equal to integral (88) for any choice of continuous function f(x). For, the set
of points of any interval contained in [0, 1] has the power of a continuum, so
that points not belonging to $\mathcal{E}$ form a set dense in [0, 1]. Therefore, when
forming the Riemann-Stieltjes sum for integral (88):

$$\sum_{k=1}^{n} f(\xi_k) [g(x_k) - g(x_{k-1})]$$

we can choose points not belonging to $\mathcal{E}$ as the points of subdivision when the
sub-intervals become indefinitely smaller, i.e. integrals (88) and (89) must be
the same.

Thus, on varying the function g(x) on a finite or denumerable set of points,
but in such a way that the new function h(x) is of bounded variation, we obtain
integral (89), yielding the same functional in C as integral (88). By what was
said in [14], we can assert that $n_\Phi \le V_0^1(h)$, where the < sign can in fact hold.
We had $n_\Phi = V_0^1(g)$ for the g(x) constructed by the method used in proving the
theorem.

We now pose the following general problem: for what functions of bounded 
variation h(x) does integral (89) define the same linear functional in C as integral 
(88)? 

On introducing the new function of bounded variation $\omega(x) = h(x) - g(x)$,
we arrive at the following problem: for what functions of bounded variation
$\omega(x)$ have we

$$\int_0^1 f(x) \, d\omega(x) = 0 \tag{90}$$

for any function f(x) continuous in [0,1]? The next theorem supplies the answer.

THEOREM. The necessary and sufficient condition for (90) to hold with any
choice of continuous function f(x) is that the function of bounded variation $\omega(x)$
satisfy the following: (1) $\omega(x_0) = \omega(0)$ at every interior point $x = x_0$ of [0, 1]
at which $\omega(x)$ is continuous; (2) $\omega(1) = \omega(0)$.
Necessity. We choose so large an n that $x_0 > 1/n$ and the point $x_0 + 1/n$
lies inside [0, 1], and we define a continuous f(x) as follows:

$$
f(x) =
\begin{cases}
1 & \text{for } 0 \le x \le x_0, \\
-nx + (1 + n x_0) & \text{for } x_0 \le x \le x_0 + \frac{1}{n}, \\
0 & \text{for } x_0 + \frac{1}{n} \le x \le 1.
\end{cases}
$$

In the central interval $[x_0, x_0 + 1/n]$, f(x) is a linear function, decreasing from
unity to zero. With such an f(x), (90) gives

$$\int_0^{x_0} d\omega(x) + \int_{x_0}^{x_0 + \frac{1}{n}} [-nx + (1 + n x_0)] \, d\omega(x) = 0$$

or

$$\omega(x_0) - \omega(0) + \int_{x_0}^{x_0 + \frac{1}{n}} [-nx + (1 + n x_0)] \, d\omega(x) = 0. \tag{91}$$

But we have [9]:

$$\left| \int_{x_0}^{x_0 + \frac{1}{n}} [-nx + (1 + n x_0)] \, d\omega(x) \right| \le 1 \cdot V_{x_0}^{x_0 + \frac{1}{n}}(\omega).$$

Since $\omega(x)$ is continuous at $x = x_0$ by hypothesis, we can say that
$V_{x_0}^{x_0 + 1/n}(\omega) \to 0$ on indefinite increase of n, and (91) gives in the limit
$\omega(x_0) = \omega(0)$. The second condition $\omega(1) = \omega(0)$ is obtained from (90) if we
put $f(x) \equiv 1$.

Sufficiency. Since the points at which a function of bounded variation is
continuous are distributed densely in [0,1], we can make use of these points
only in forming the Riemann-Stieltjes sum. But now, in view of the conditions
$\omega(x_0) = \omega(0) = \omega(1)$, all the differences $\omega(x_k) - \omega(x_{k-1})$ vanish, so that (90)
holds for any choice of continuous f(x). The theorem is proved.

We have thus shown that the necessary and sufficient condition for integral
(89) to give the same linear functional in C as (88) is that the difference
h(x) - g(x) be equal to h(0) - g(0) at every point at which it is continuous, as
also at x = 1.

Let us return to (88). As we know, $n_\Phi \le V_0^1(h)$ for any form (88) of $\Phi(f)$.
We have $n_\Phi = V_0^1(g)$ in (88). Further, a constant can evidently be added to
g(x). We shall eliminate this many-valuedness by assuming g(0) = 0 in (88).
The function g(x) can only have a finite or denumerable set of points of
discontinuity. Let us consider a point of discontinuity $\xi$ such that
$g(\xi - 0) = g(\xi + 0)$, but $g(\xi) \ne g(\xi - 0)$ (a removable discontinuity). If we change
g(x) at the single point $x = \xi$ by putting $g(\xi) = g(\xi - 0)$, we remove this
discontinuity without changing integral (88), though with a decrease in $V_0^1(g)$.
But this is an impossibility, since we had earlier $V_0^1(g) = n_\Phi$, whilst after the
change described we should have the impossible $V_0^1(g) < n_\Phi$. Hence it follows
from $V_0^1(g) = n_\Phi$ that g(x) has no removable discontinuities. There remain the
discontinuities at which $g(\xi - 0) \ne g(\xi + 0)$. If we take $g(\xi)$, first as belonging
to the closed interval $i$ with ends $g(\xi - 0)$, $g(\xi + 0)$, then as lying outside this
interval, $V_0^1(g)$ will obviously be greater in the second case than in the first.
Hence it follows from $V_0^1(g) = n_\Phi$ that, at irremovable discontinuities, $g(\xi)$
belongs to the closed interval $i$. The position of the number $g(\xi)$ in the interval
$i$ has no effect on the total variation $V_0^1(g)$. Thus, given $V_0^1(g) = n_\Phi$, the
definition of $g(\xi)$ at the points of discontinuity is the only arbitrary feature, though
$g(\xi)$ must belong to $i$. We usually put

$$g(\xi) = \frac{g(\xi - 0) + g(\xi + 0)}{2},$$

or else either $g(\xi) = g(\xi + 0)$ (continuity from the right), or $g(\xi) = g(\xi - 0)$
(continuity from the left).

A similar treatment to the above can be given of the space C of complex
functions $\varphi(x) + i\psi(x)$, continuous in the interval [0,1]. The elements of this
space can be multiplied by complex numbers and added. The definitions of
norm and linear functional are retained, but the values of a functional may be
real or complex. The theorem on the general form of the linear functional holds,
where complex functions of bounded variation have the form $\lambda(x) + i\mu(x)$,
where $\lambda(x)$ and $\mu(x)$ are real functions of bounded variation.


16. Linear operators in C. An operator in C is any definite law in accordance
with which any element f(x) of C is associated with a definite element
$\varphi(x)$, also of C.

We bring in the notation F[f(x)] for the operator. Given a function f(x)
of C, the symbol F[f(x)] defines some function $\varphi(x)$ also of C. The definition of
a distributive operator is like the definition of a distributive functional, i.e.
a formula analogous to (76) is used. A bounded operator is defined by a formula
similar to (77), except that the absolute value on the left-hand side must be
replaced by the norm, since F[f(x)] is an element of C and not a number:

$$\| F[f(x)] \| \le N \, \| f(x) \|. \tag{$77_1$}$$

A distributive and bounded operator is said to be linear. Such an operator
is necessarily continuous, i.e. if $f_n(x) \to f(x)$, then $F[f_n(x)] \to F[f(x)]$, the
convergence in both cases being the uniform convergence of the corresponding
sequence of functions in [a, b].

We shall state without proof the fundamental result on the general form of
a linear operator in C. Let g(x, y) be a function defined in the closed
two-dimensional interval $0 \le x \le 1$, $0 \le y \le 1$, and of bounded variation with
respect to x in the interval [0,1] for any value of y in the interval [0,1]. On
substituting this function g(x, y) in the right-hand side of (78), instead of a
number we obtain a function of the parameter y, defined in the interval [0,1]:

$$\varphi(y) = \int_0^1 f(x) \, d_x g(x, y).$$


The necessary and sufficient condition for this last formula to yield a linear
operator is that g(x, y) has, in addition to the above properties, the further
property that the function $\varphi(y)$ defined by the last formula be continuous in
[0, 1] for any choice of f(x), continuous in [0,1]. If $y_0$ is a value from [0,1],
and $y_n$ (n = 1, 2, 3, ...) is a sequence of numbers from [0,1] having $y_0$ as
its limit, the definition of continuity for $\varphi(y)$ at $y_0$ leads at once to the following
necessary condition that must be satisfied by g(x, y): given any $y_0$ of [0,1]
and any f(x) continuous in this interval, we must have

$$\lim_{y_n \to y_0} \int_0^1 f(x) \, d_x g(x, y_n) = \int_0^1 f(x) \, d_x g(x, y_0), \tag{92}$$


where $y_n$ is any sequence of numbers of [0, 1] having $y_0$ as its limit. A function
g(x, y) that has these properties is usually said to be weakly continuous with
respect to the parameter y. If g(x, y) is of bounded variation with respect to
x and weakly continuous with respect to y, (92) obviously yields a linear operator
in C. The converse can be proved, i.e. every linear operator in C is expressible
by (92), where g(x, y) is of bounded variation with respect to x and weakly
continuous with respect to y. The proof of this theorem and a discussion of the
concept of weak continuity may be found in V. I. Glivenko’s book The Stieltjes
Integral.

If K(x, y) is a function continuous in the two-dimensional interval
$0 \le x \le 1$, $0 \le y \le 1$, the formula

$$\psi(y) = \int_0^1 K(y, x) f(x) \, dx$$

evidently gives a linear operator in C. We encountered these operators in the
theory of integral equations. However, not every operator in C can be written
in this form.
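
The distributive property of an operator with a continuous kernel can be checked numerically. The sketch below is illustrative only: the kernel `K`, the midpoint rule used for the integral and the helper names are assumptions of this example.

```python
def apply_kernel_operator(K, f, y, n=200):
    """Midpoint-rule approximation of the value at y of the integral of K(y, x) f(x) over [0, 1]."""
    h = 1.0 / n
    return h * sum(K(y, (j + 0.5) * h) * f((j + 0.5) * h) for j in range(n))

def K(y, x):
    # Hypothetical continuous kernel on the unit square.
    return y * x

def one(x):
    return 1.0

def square(x):
    return x * x

value = apply_kernel_operator(K, one, 0.4)   # the exact value is 0.4 / 2
combined = apply_kernel_operator(K, lambda x: 2.0 * one(x) + 3.0 * square(x), 0.4)
separate = 2.0 * apply_kernel_operator(K, one, 0.4) + 3.0 * apply_kernel_operator(K, square, 0.4)
```

The numbers `combined` and `separate` agree to rounding error, which is the analogue of (76) for this operator.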


17. Functions of an interval. In future generalized concepts of the
integral it will be more convenient for us to utilize functions of an
interval instead of functions of a point. Let g(x) be a given
non-decreasing bounded function on the infinite axis $(-\infty, +\infty)$. We
associate with any interval $\Delta = (\alpha, \beta]$ semi-open from the left a
non-negative number: $g(\beta + 0) - g(\alpha + 0)$ (the mass contained in this
interval). Hence we obtain a function of a semi-open interval, which
we shall write as G(Δ):

$$G(\Delta) = g(\beta + 0) - g(\alpha + 0). \tag{93}$$


To formulate the properties of this function we must introduce a
new concept. We shall say that a sequence $\Delta^{(1)}, \Delta^{(2)}, \dots$ of semi-open
intervals is vanishing if every interval $\Delta^{(k+1)}$ belongs to the previous
interval $\Delta^{(k)}$, and there is no point common to all the intervals. Let us
explain the structure of a vanishing sequence of intervals. Let $\Delta^{(k)}$ be
$(a_k, b_k]$. By hypothesis, $a_k \le a_{k+1}$, $b_k \ge b_{k+1}$, and $b_k - a_k \to 0$ on
indefinite increase of k. The monotonic sequences $a_k$ and $b_k$ have a
common limit c, where $a_k \le c \le b_k$ for any k. Since there is no common
point (in particular, c) of all the intervals, the point c must coincide with
the left-hand open end of the interval for all sufficiently large k, i.e.
$\Delta^{(k)}$ is $(c, b_k]$ for all sufficiently large k and $b_k \to c$. Now:

$$G(\Delta^{(k)}) = g(b_k + 0) - g(c + 0) \to 0,$$

since $g(b_k + 0) \to g(c + 0)$. Definition (93) and the arguments just
adduced lead at once to the following three fundamental properties
of G(Δ):
(1) G(Δ) is non-negative;

(2) it is additive, i.e. if the semi-open interval Δ is split into a finite
number of semi-open intervals $\Delta_1, \Delta_2, \dots, \Delta_q$, no pair of which has
common points, this being usually expressed by the equation

$$\Delta = \Delta_1 + \Delta_2 + \dots + \Delta_q, \tag{94}$$

then

$$G(\Delta) = \sum_{k=1}^{q} G(\Delta_k); \tag{95}$$

(3) the function G(Δ) tends to zero in a vanishing sequence of intervals.

This last will be described as the property of normality of G(Δ).
It has an obvious physical meaning. We shall discuss any intervals:
$(\alpha, \beta]$, $[\alpha, \beta)$, $[\alpha, \beta]$, $(\alpha, \beta)$, and not merely those semi-open on the left;
also, a separate point $\alpha$ will be regarded as the interval $[\alpha, \alpha]$. By
starting from a non-decreasing bounded function of a point g(x), we
can construct a function G(Δ) of any type of interval, where the
function possesses the three basic properties given above. All we have to do
is introduce, in addition to (93), the following further definitions:

$$
\begin{aligned}
G([\alpha, \beta)) &= g(\beta - 0) - g(\alpha - 0); \\
G([\alpha, \beta]) &= g(\beta + 0) - g(\alpha - 0); \\
G((\alpha, \beta)) &= g(\beta - 0) - g(\alpha + 0),
\end{aligned}
\tag{96}
$$

or, if $[\alpha]$ is an interval consisting of a single point, then

$$G([\alpha]) = g(\alpha + 0) - g(\alpha - 0). \tag{97}$$

If the non-decreasing function g(x) is defined only in a finite interval,
say in (a, b], we can extend it to the entire axis by putting g(x) = g(a + 0)
for $x \le a$ and g(x) = g(b) for x > b.
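
Definitions (93), (96) and (97) translate directly into a small routine. The sketch below assumes, for illustration only, a function g(x) that increases purely by jumps; the list `jumps` and the helper names are hypothetical.

```python
def g_right(x, jumps):
    """g(x + 0) for a non-decreasing pure-jump function given as (point, jump) pairs."""
    return sum(h for p, h in jumps if p <= x)

def g_left(x, jumps):
    """g(x - 0) for the same function."""
    return sum(h for p, h in jumps if p < x)

def G(a, b, jumps, closed_left, closed_right):
    """Interval function: (93) for (a, b], (96) for the other types, (97) for a point [a, a]."""
    upper = g_right(b, jumps) if closed_right else g_left(b, jumps)
    lower = g_left(a, jumps) if closed_left else g_right(a, jumps)
    return upper - lower

jumps = [(0.0, 1.0), (1.0, 0.5)]   # unit mass at x = 0 and mass 0.5 at x = 1

whole = G(-1.0, 1.0, jumps, False, True)            # the interval (-1, 1]
parts = (
    G(-1.0, 0.0, jumps, False, False)               # (-1, 0)
    + G(0.0, 0.0, jumps, True, True)                # the point [0]
    + G(0.0, 1.0, jumps, False, False)              # (0, 1)
    + G(1.0, 1.0, jumps, True, True)                # the point [1]
)
```

Here `whole` and `parts` are both 1.5, which illustrates additivity when the sub-intervals are of different types and have no common points.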

We have arrived at the concept of a function of an interval G(Δ)
by starting out from a non-decreasing function of a point g(x).
Conversely, given an interval function with the above three properties,
we can form a point function g(x), which leads to G(Δ) in accordance
with the above scheme. For this, it is sufficient to put

$$g(x) = G((-\infty, x]). \tag{98}$$


The function g(x) thus formed is obviously continuous from the
right. If G(Δ) is defined only for intervals belonging to some interval
$\Delta_0$, it can be defined for all intervals by putting $G(\Delta) = G(\Delta \cdot \Delta_0)$,
where the product of intervals $\Delta \cdot \Delta_0$ is defined as the interval
consisting of points that belong simultaneously to Δ and $\Delta_0$. If there are
no such points, i.e. $\Delta \cdot \Delta_0$ is the empty set, we naturally put
$G(\Delta \cdot \Delta_0) = 0$. Notice that, if a constant is added to g(x), this has no effect on
the value of G(Δ). If g(x) is a function of bounded variation, we make
use of its canonical form as the difference between two non-decreasing
functions: $g(x) = g_1(x) - g_2(x)$. The functions $g_1(x)$ and $g_2(x)$ lead us to
interval functions $G_1(\Delta)$ and $G_2(\Delta)$ with the three properties
mentioned, whilst the function g(x) gives us the function $G(\Delta) = G_1(\Delta) - G_2(\Delta)$.
This function G(Δ) can be formed directly from g(x) in accordance with
(93), (96) and (97). It is additive and normal, but may take negative values.


If we had the closed interval $[-\infty, +\infty]$, we should have to put, instead of
(98):

$$g(x) = G([-\infty]) + G((-\infty, x]). \tag{99}$$


18. The general Stieltjes integral. We now turn to a generalized type of
Stieltjes integral. As already remarked in [3], we obtain such a generalization
if we require only that the numbers i and I defined in [3] be coincident. Further,
we shall consider integration over an interval Δ of any type, and shall split Δ
into sub-intervals $\Delta_k$ of any type, having no common points, either inside or
at the ends. Naturally, an individual point is considered admissible as a
sub-interval.

Suppose, then, that we have bounded functions f(x) and g(x) given on a finite
or infinite interval Δ, where g(x) is non-decreasing. We divide Δ into
sub-intervals $\Delta_k$ (k = 1, 2, ..., p) of any type with no common points. Let $m_k$
and $M_k$ be the strict lower and strict upper bounds of f(x) in $\Delta_k$ and let $\xi_k$
be any point of $\Delta_k$. Together with g(x), we bring in the interval function $G(\Delta_k)$,
defined in [17], and we form the sums:

$$s_\delta = \sum_{k=1}^{p} m_k G(\Delta_k); \qquad S_\delta = \sum_{k=1}^{p} M_k G(\Delta_k); \qquad \sigma_\delta = \sum_{k=1}^{p} f(\xi_k) G(\Delta_k). \tag{100}$$


Some facts must be mentioned in connection with our use of intervals of
any type. If a point P with abscissa x comes in as an independent element
of the subdivision of Δ, the corresponding term in each of sums (100) is the same
and has the form f(x)[g(x + 0) - g(x - 0)]. There is no sense in bringing in
points where g(x) is continuous as independent elements of the subdivision, and
it can be assumed in future that, if a point is an independent element of the
subdivision, it is a point of discontinuity of g(x).

If $\Delta'$ and $\Delta''$ are two intervals, their product $\Delta' \Delta''$ is defined as the set of points
which belong simultaneously to $\Delta'$ and $\Delta''$. This is again an interval, or is the
empty set. Let $\delta'$ and $\delta''$ be two subdivisions of Δ. The product of subdivisions
$\delta'$ and $\delta''$ is defined as the subdivision $\delta' \delta''$ consisting of all possible intervals
$\Delta' \Delta''$, where $\Delta'$ belongs to $\delta'$ and $\Delta''$ belongs to $\delta''$. Pairs of the $\Delta' \Delta''$ obviously
have no common points, whilst their sum gives the basic interval Δ. The
subdivision $\delta'$ is called an extension of the subdivision $\delta$ if every element of $\delta'$ lies
wholly in one of the elements of the subdivision $\delta$. The product $\delta' \delta''$ is an
extension both of $\delta'$ and of $\delta''$. If the interval Δ is closed on the right and x = b
is the right-hand end of Δ, we have to assume g(b + 0) = g(b) in the
definition of G(Δ). Similarly, at the left-hand end g(a - 0) = g(a). This also refers
to the case when $b = +\infty$ or $a = -\infty$. As in [3], we write i for the strict upper
bound of sums $s_\delta$ and I for the strict lower bound of sums $S_\delta$ for all possible
laws of subdivision. Everything that we said in [3] holds for $s_\delta$, $S_\delta$, $\sigma_\delta$, i and I.

DEFINITION. We shall say that f(x) is integrable with respect to g(x) (or
G(Δ)) if i = I, and we take i as the value of the integral:

$$i = \int_\Delta f(x) \, dg(x) = \int_\Delta f(x) \, G(d\Delta). \tag{101}$$

The integral thus defined is called the general Stieltjes integral.

The integral defined in [2] will be called simply the Stieltjes integral, or the
original Stieltjes integral, in order to distinguish it from the general integral.
We now give the conditions for the existence and the properties of the general
integral.

THEOREM 1. The necessary and sufficient condition for the existence of integral
(101) is that there exist a sequence of subdivisions $\delta_n$ (n = 1, 2, ...) such that the
difference $S_{\delta_n} - s_{\delta_n}$ tends to zero, or, what amounts to the same thing, $\sigma_{\delta_n}$ has
a definite limit for any choice of points $\xi_k^{(n)}$. This limit A in fact gives the value
of the integral. Now, $s_{\delta_n} \to A$ and $S_{\delta_n} \to A$.

This theorem follows directly from [3], the sub-intervals $\Delta_k$ having no
common points in the present case. It should be mentioned that the sequence of
subdivisions $\delta_n$ mentioned in the theorem need not necessarily be a sequence
with the sub-intervals becoming indefinitely smaller. If, for instance, there
exists a subdivision $\delta$ such that $\sigma_\delta$ does not depend on the choice of points $\xi_k$,
we can take all the $\delta_n$ as coinciding with $\delta$.

DEFINITION. A sequence of subdivisions $\delta_n$ is said to be regular for the function
g(x) (or G(Δ)) if the following two conditions are fulfilled: (1) every point of
discontinuity of g(x) appears as an independent element of subdivision for all $\delta_n$ as from
a certain n; (2) the sub-intervals of $\delta_n$ become indefinitely smaller on increase of
n, this indefinite refinement having the meaning indicated in [4] in the case of
an infinite interval Δ.

THEOREM 2. If $\delta_n$ is a regular sequence of subdivisions, $s_{\delta_n} \to i$ and $S_{\delta_n} \to I$
for any choice of bounded function f(x).
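
The role of a point taken as an independent element of the subdivision can be seen in a small computation. In the sketch below, f(x) and g(x) are hypothetical step functions that both jump at x = 0, the interval of integration is [-1, 1], and the bounds of f(x) in a sub-interval are estimated by sampling, which is exact for this f(x).

```python
def g_right(x):
    return 1.0 if x >= 0 else 0.0   # g(x + 0) for a unit jump of g at x = 0

def g_left(x):
    return 1.0 if x > 0 else 0.0    # g(x - 0)

def G(a, b, closed_left, closed_right):
    """Interval function built from g, following (93), (96) and (97)."""
    upper = g_right(b) if closed_right else g_left(b)
    lower = g_left(a) if closed_left else g_right(a)
    return upper - lower

def f(x):
    return 1.0 if x >= 0 else 0.0   # discontinuous at the same point as g

def sample_points(a, b, closed_left, closed_right, n=50):
    if a == b:
        return [a]
    points = [a + (b - a) * (i + 0.5) / n for i in range(n)]
    if closed_left:
        points.append(a)
    if closed_right:
        points.append(b)
    return points

def lower_upper_sums(subdivision):
    """Sums s and S of (100) for sub-intervals given as (a, b, closed_left, closed_right)."""
    s = S = 0.0
    for a, b, closed_left, closed_right in subdivision:
        values = [f(x) for x in sample_points(a, b, closed_left, closed_right)]
        mass = G(a, b, closed_left, closed_right)
        s += min(values) * mass
        S += max(values) * mass
    return s, S

# [-1, 0), the point [0], (0, 1]: the discontinuity is an independent element.
with_point = lower_upper_sums([(-1.0, 0.0, True, False), (0.0, 0.0, True, True), (0.0, 1.0, False, True)])
# [-1, 0], (0, 1]: the discontinuity lies inside a sub-interval of positive length.
without_point = lower_upper_sums([(-1.0, 0.0, True, True), (0.0, 1.0, False, True)])
```

With the point x = 0 isolated, both sums equal f(0) times the jump of g, i.e. `with_point` is (1.0, 1.0), so a single subdivision already settles the existence of the general integral in this example; `without_point` is (0.0, 1.0), and the two sums stay apart.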
Let $\delta$ be any given subdivision of Δ and $\delta'_n = \delta \delta_n$. We shall take n so large
that, firstly, all the points of discontinuity of g(x) which are points of the
subdivision $\delta$ have already appeared as independent elements of subdivision $\delta_n$,
and secondly, every sub-interval of $\delta_n$ contains not more than one point of the
subdivision $\delta$. It has to be borne in mind in future that the number of such
points is fixed. Let us consider the difference $s_{\delta'_n} - s_{\delta_n}$. The terms in the sums
$s_{\delta'_n}$ and $s_{\delta_n}$ that correspond to the points of $\delta$ which are points of discontinuity
of g(x) are the same, since, by what has been said above, they are independent
elements of subdivision in $\delta_n$ and hence in $\delta'_n$. In future, we shall only speak
of the points of $\delta$ at which g(x) is continuous. Let the number of these be q.
If a sub-interval of subdivision $\delta_n$ does not contain such a point of subdivision
in its interior, the terms in $s_{\delta_n}$ and $s_{\delta'_n}$ corresponding to it are the same. There
remain not more than q sub-intervals of $\delta_n$ which contain these points of
$\delta$ as interior points. Let x = x' be a point of $\delta$ lying inside the sub-interval
$\Delta^{(n)}$ of $\delta_n$. On passing from $\delta_n$ to $\delta'_n$, a term of the form $\lambda^{(n)} G(\Delta^{(n)})$ is replaced
by two terms of the form $\mu^{(n)} G(\Delta_1^{(n)}) + \nu^{(n)} G(\Delta_2^{(n)})$, where $\Delta^{(n)}$ contains x = x'
as an interior point, and $\Delta_1^{(n)}$ and $\Delta_2^{(n)}$ contain it at the ends (precisely where
x' is reckoned to be is of no importance, since g(x) is continuous at x = x').
The $\lambda^{(n)}$, $\mu^{(n)}$, $\nu^{(n)}$ denote numbers whose absolute values do not exceed the
strict upper bound of | f(x) | in Δ.

Since the sub-intervals become indefinitely finer as n increases in a regular
sequence, this is true for $\Delta^{(n)}$, $\Delta_1^{(n)}$, $\Delta_2^{(n)}$, and since these intervals contain
a fixed point x = x', at which g(x) is continuous, either as an interior point
or as an end, we can assert that

$$\lambda^{(n)} G(\Delta^{(n)}) \to 0 \quad \text{and} \quad \mu^{(n)} G(\Delta_1^{(n)}) + \nu^{(n)} G(\Delta_2^{(n)}) \to 0 \quad \text{when } n \to \infty.$$

On recalling also that the number of points x' does not exceed q, we can say
that $s_{\delta'_n} - s_{\delta_n} \to 0$ as $n \to \infty$.

On the other hand, we have the inequalities $s_\delta \le s_{\delta'_n} \le i$ and $s_{\delta_n} \le s_{\delta'_n} \le i$.
We can choose the subdivision $\delta$ such that $s_\delta$ is as close as desired to i, i.e.
given any positive $\varepsilon$, we can choose a $\delta$ such that $i - s_\delta < \varepsilon$. It now follows at
once from $s_\delta \le s_{\delta'_n} \le i$ that $i - s_{\delta'_n} < \varepsilon$, and finally, the result obtained above,
$s_{\delta'_n} - s_{\delta_n} \to 0$, shows us that, for all sufficiently large n, we have $i - s_{\delta_n} < 2\varepsilon$,
i.e. since $\varepsilon$ is arbitrary, $s_{\delta_n} \to i$, which is what we wanted to prove. It can be
shown in precisely the same way that $S_{\delta_n} \to I$. Notice that, if g(x) is
continuous, the only characteristic feature of a regular sequence is that its
sub-intervals become indefinitely finer. This is the case, for instance, with the
Riemann integral, where g(x) = x. Since x is unbounded in an infinite interval,
when forming the proper Riemann integral we were forced to consider only
bounded intervals.

THEOREM 3. If integral (101) exists and is equal to A, $\sigma_{\delta_n} \to A$ for any regular
sequence of subdivisions. It follows at once from the hypothesis that i = I = A.
With this, $s_{\delta_n} \to A$ and $S_{\delta_n} \to A$ by virtue of Theorem 2, so that $\sigma_{\delta_n}$, which
satisfies $s_{\delta_n} \le \sigma_{\delta_n} \le S_{\delta_n}$, all the more tends to A for any choice of points $\xi_k^{(n)}$.

The following corollary is a direct consequence of the above theorems.
If the limit $\sigma_{\delta_n} \to A$ exists for some regular sequence $\delta_n$, it likewise exists for any
other regular sequence and is also equal to A. The integral now exists, and is equal
to A. If $\sigma_{\delta_n}$ has no definite limit for some regular sequence of $\delta_n$, integral (101)
does not exist. Hence, the use of sums $\sigma_\delta$ with regular sequences of subdivisions
directly solves the problem of the existence of the general Stieltjes integral.

A further remark: if $\delta_n$ is a regular sequence for g(x) and $\delta'_n$ is an extension
of $\delta_n$, then $\delta'_n$ is also a regular sequence. This follows at once from the definition
of regular sequence.


19. Properties of the (general) Stieltjes integral. As we have seen, the general
Stieltjes integral can be obtained as the limit of sums $\sigma_\delta$ for some choice of
sequence of subdivisions. Now, if $\sigma_{\delta_n}$ have a definite limit, and $\delta'_n$ is an extension
of $\delta_n$, the $\sigma_{\delta'_n}$ also have the same limit. We can therefore prove the properties
of the general Stieltjes integral by using the sums $\sigma_\delta$, just as we did for the
Riemann integral and the original Stieltjes integral. We shall give these
properties with some corollaries. Throughout what follows, the functions f(x)
and g(x) are assumed bounded in the interval of integration, and in addition,
g(x) is assumed non-decreasing.


I. If $c_k$ are constants, we have

$$\int_\Delta \left[ \sum_{k=1}^{p} c_k f_k(x) \right] dg(x) = \sum_{k=1}^{p} c_k \int_\Delta f_k(x) \, dg(x), \tag{102}$$

where the existence of the integrals on the right implies the existence
of the integral on the left.

This is proved simply by taking any regular sequence of $\delta_n$ for the function
g(x).


II. If $g_k(x)$ (k = 1, 2, ..., p) are non-decreasing bounded functions
and $c_k$ are positive constants, we have

$$\int_\Delta f(x) \, d\!\left[ \sum_{k=1}^{p} c_k g_k(x) \right] = \sum_{k=1}^{p} c_k \int_\Delta f(x) \, dg_k(x), \tag{103}$$

where the existence of the integrals on the right implies the existence
of the integral on the left, and vice versa.


Let $\delta_n^{(k)}$ be a regular sequence of subdivisions for $g_k(x)$.
The sequence $\delta_n = \delta_n^{(1)} \delta_n^{(2)} \dots \delta_n^{(p)}$ will be a regular sequence for all the
$g_k(x)$ (k = 1, 2, ..., p). We have the obvious equation

$$S_{\delta_n} - s_{\delta_n} = \sum_{k=1}^{p} c_k \left( S_{\delta_n}^{(k)} - s_{\delta_n}^{(k)} \right), \tag{104}$$

where $s_{\delta_n}$ and $S_{\delta_n}$ refer to the integral on the left-hand side of (103), whilst
$s_{\delta_n}^{(k)}$ and $S_{\delta_n}^{(k)}$ refer to the integrals on the right-hand side. If the integrals
on the right-hand side of (103) exist, $S_{\delta_n}^{(k)} - s_{\delta_n}^{(k)} \to 0$ (k = 1, 2, ..., p),
and consequently $S_{\delta_n} - s_{\delta_n} \to 0$, i.e. the integral on the left-hand side exists.
Now suppose that this latter integral exists. There must now exist a sequence
of subdivisions $\delta_n$ such that $S_{\delta_n} - s_{\delta_n} \to 0$. The terms on the right-hand side
of (104) are non-negative, so that $S_{\delta_n}^{(k)} - s_{\delta_n}^{(k)} \to 0$ for k = 1, 2, ..., p,
i.e. the integrals on the right-hand side of (103) exist. Formula (103) itself
follows at once from the analogous property for the finite sums $\sigma_\delta$, and a passage
to the limit.


III. If the interval $\Delta$ is divided into a finite number of intervals
$\Delta_1, \Delta_2, \ldots, \Delta_m$, with no common points, then

$$\int_\Delta f(x)\, dg(x) = \sum_{k=1}^{m} \int_{\Delta_k} f(x)\, dg(x), \tag{105}$$

where the existence of the integrals on the right implies the existence
of the integral on the left, and vice versa.


Suppose that the integral on the left exists. By Theorem 1, a sequence $\delta_n$
of subdivisions of $\Delta$ exists such that $S_{\delta_n} - s_{\delta_n} \to 0$. Let $\delta$ denote the subdivision
of $\Delta$ into $\Delta_k$ $(k = 1, 2, \ldots, m)$ and let $\delta'_n = \delta_n \delta$. We obviously have
$S_{\delta'_n} - s_{\delta'_n} \to 0$, since $s_{\delta'_n} \ge s_{\delta_n}$ and $S_{\delta'_n} \le S_{\delta_n}$. The sum of the form (16) that
represents $S_{\delta'_n} - s_{\delta'_n}$ can be split into $m$ non-negative terms, each of which represents
the corresponding sum for some $\Delta_k$, and, since the whole sum tends to zero, we can
say that this is true for the individual sums, i.e. the integrals on the right of
(105) exist. Conversely, let the integrals on the right exist. For each of them,
there exists a sequence of subdivisions $\delta_n^{(k)}$ with which the difference
$S_{\delta_n^{(k)}} - s_{\delta_n^{(k)}}$ tends to zero. Combining these subdivisions of the $\Delta_k$ leads at
once to a sequence of subdivisions of the entire interval $\Delta$ with which the
corresponding difference for the integral on the left also tends to zero. Notice
that, for the original Stieltjes integral, in the third of formulae (3), the existence
of the integrals on the right does not imply the existence of the integral on the
left.


IV. If we have $|f(x)| \le L$ in the interval $\Delta$, then

$$\Big|\int_\Delta f(x)\, dg(x)\Big| \le L\, G(\Delta), \tag{106}$$

where $G(\Delta)$ is the total increment of the function $g(x)$ in the interval $\Delta$,
on the assumption that the integral exists.
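
Inequality (106) already holds for every finite sum, which makes it easy to check numerically. The following is only a minimal sketch under stated assumptions: $g(x) = x^3$ is taken continuous and non-decreasing on $[0, 2]$, so that the general and the original Stieltjes sums coincide; the names `stieltjes_sum`, `approx` and `bound` are illustrative and do not come from the text.

```python
import math

def stieltjes_sum(f, g, a, b, n):
    """Sum of f(xi_k) * [g(x_k) - g(x_{k-1})] over n equal sub-intervals of [a, b].

    Assumes g is continuous and non-decreasing, so each increment of g
    plays the role of G(Delta_k); xi_k is taken at the midpoint.
    """
    total = 0.0
    for k in range(n):
        x0 = a + (b - a) * k / n
        x1 = a + (b - a) * (k + 1) / n
        xi = 0.5 * (x0 + x1)
        total += f(xi) * (g(x1) - g(x0))
    return total

f = math.sin                      # |f(x)| <= L with L = 1
g = lambda x: x ** 3              # non-decreasing on [0, 2]

approx = stieltjes_sum(f, g, 0.0, 2.0, 2000)
bound = 1.0 * (g(2.0) - g(0.0))   # L * G(Delta)

# For this smooth g the integral equals the ordinary integral of 3 x^2 sin x.
exact = 3.0 * (-2.0 * math.cos(2.0) + 4.0 * math.sin(2.0) - 2.0)
print(approx, exact, bound)
```

The sum stays below the bound $L\,G(\Delta) = 8$ for any choice of subdivision and of points $\xi_k$, since each term is at most $L$ times a non-negative increment of $g$.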

V. If the functions $f_p(x)$ tend uniformly to $f(x)$ in the interval $\Delta$ as
$p \to \infty$, and the integrals of $f_p(x)$ with respect to $g(x)$ exist, the integral
of $f(x)$ with respect to $g(x)$ also exists, and we have

$$\lim_{p \to \infty} \int_\Delta f_p(x)\, dg(x) = \int_\Delta f(x)\, dg(x). \tag{107}$$


Let $\Delta_k^{(n)}$ $(k = 1, 2, \ldots, t_n)$ be the intervals of subdivision of some regular
sequence of subdivisions $\delta_n$ for the function $g(x)$. We consider the sums $\sigma_{\delta_n}^{(p)}$
and $\sigma_{\delta_n}$ for the functions $f_p(x)$ and for $f(x)$, which is evidently bounded in view
of the uniform convergence of the $f_p(x)$:

$$\sigma_{\delta_n}^{(p)} = \sum_{k=1}^{t_n} f_p\big(\xi_k^{(n)}\big)\, G\big(\Delta_k^{(n)}\big); \qquad \sigma_{\delta_n} = \sum_{k=1}^{t_n} f\big(\xi_k^{(n)}\big)\, G\big(\Delta_k^{(n)}\big). \tag{108}$$

The points $\xi_k^{(n)}$ are taken to be the same in all the sums. We form the difference

$$\sigma_{\delta_n} - \sigma_{\delta_n}^{(p)} = \sum_{k=1}^{t_n} \big[f\big(\xi_k^{(n)}\big) - f_p\big(\xi_k^{(n)}\big)\big]\, G\big(\Delta_k^{(n)}\big).$$

Given any positive $\varepsilon$, in view of the uniform convergence of the $f_p(x)$ there
exists an $N$ such that $|f(x) - f_p(x)| \le \varepsilon$ for $p \ge N$ and for any $x$ of $\Delta$. Given
$p \ge N$, and any choice of $\xi_k^{(n)}$, we have the following inequality:
$|\sigma_{\delta_n} - \sigma_{\delta_n}^{(p)}| \le \varepsilon\, G(\Delta)$. It is clear from this that $\sigma_{\delta_n}^{(p)}$ tends to $\sigma_{\delta_n}$ as $p \to \infty$,
the convergence being uniform with respect to $n$ and the choice of points $\xi_k^{(n)}$. By hypothesis,
the $f_p(x)$ are integrable with respect to $g(x)$, so that each of the sums $\sigma_{\delta_n}^{(p)}$ has
a limit as $n$ increases indefinitely: let the limit be $A_p$. This limit is in fact the
integral of $f_p(x)$ with respect to $g(x)$. We now show that the sequence of numbers
$A_p$ has a limit.

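We have by (106):

$$|A_p - A_q| \le G(\Delta)\, \max |f_p(x) - f_q(x)|.$$

By hypothesis, the right-hand side tends to zero as $p, q \to \infty$, so that $A_p$
has a limit as $p \to \infty$, which we write as $A$. It remains for us to show that
$\sigma_{\delta_n} \to A$ as $n \to \infty$. Let us consider the difference $A - \sigma_{\delta_n}$, which we can write as

$$A - \sigma_{\delta_n} = (A - A_p) + \big(A_p - \sigma_{\delta_n}^{(p)}\big) + \big(\sigma_{\delta_n}^{(p)} - \sigma_{\delta_n}\big). \tag{109}$$

Given an $\varepsilon > 0$, we first choose $p$ so large that $|A - A_p| \le \varepsilon$ and
$|\sigma_{\delta_n}^{(p)} - \sigma_{\delta_n}| \le \varepsilon$ for any $n$ and $\xi_k^{(n)}$.

Further, for all sufficiently large $n$, we have $|A_p - \sigma_{\delta_n}^{(p)}| \le \varepsilon$ for the $p$
fixed above. Therefore $|A - \sigma_{\delta_n}| \le 3\varepsilon$, whence it follows, since $\varepsilon$ is arbitrary,
that $\sigma_{\delta_n} \to A$.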


VI. If the integral of $f(x)$ with respect to $g(x)$ exists, the integral of
$|f(x)|$ with respect to $g(x)$ also exists, and we have

$$\Big|\int_\Delta f(x)\, dg(x)\Big| \le \int_\Delta |f(x)|\, dg(x). \tag{110}$$


We bring in the usual notation $m_k$ and $M_k$ for $f(x)$. If both numbers are positive,
we have the same strict bounds for $|f(x)|$. If both are negative, the strict
lower and strict upper bounds for $|f(x)|$ will be $|M_k|$ and $|m_k|$, and
the difference between the strict upper and strict lower bounds remains as before.
Finally, if $m_k$ is negative, and $M_k$ positive, the strict upper bound of $|f(x)|$
will be the greater of the numbers $|m_k|$ and $M_k$, whilst the strict lower bound
will be a number $\ge 0$. Thus the difference between the strict upper and strict
lower bounds of $|f(x)|$ will never be greater than this difference for $f(x)$. Hence,
if the difference $S_{\delta_n} - s_{\delta_n}$ tends to zero for $f(x)$ for some sequence of subdivisions,
it will all the more tend to zero for $|f(x)|$ for the same sequence of subdivisions,
i.e. the existence of the integral of $f(x)$ implies the existence of the integral of
$|f(x)|$. Inequality (110) is obtained at once from the corresponding inequality
for the sums by passage to the limit.


VII. If the functions $f_1(x)$ and $f_2(x)$ are integrable with respect to
$g(x)$, their product $f_1(x) f_2(x)$ is integrable with respect to $g(x)$.


We first show that, if $f(x)$ is integrable with respect to $g(x)$, then $f^2(x)$ is
integrable with respect to $g(x)$. We shall assume for the moment that $f(x)$ is positive,
and form the differences $S_\delta - s_\delta$ for $f(x)$ and for $f^2(x)$. For $f(x)$ the difference is

$$\sum_{k=1}^{n} (M_k - m_k)\, G(\Delta_k),$$

and for $f^2(x)$ it is

$$\sum_{k=1}^{n} (M_k^2 - m_k^2)\, G(\Delta_k) = \sum_{k=1}^{n} (M_k + m_k)(M_k - m_k)\, G(\Delta_k).$$

If the first of these tends to zero for some sequence of subdivisions, the second
will also tend to zero for the same sequence, since the factor $M_k + m_k$ is bounded.
Therefore, given a positive function $f(x)$, the integrability of $f(x)$ implies the
integrability of $f^2(x)$. If $f(x)$ is not positive, there exists, in view of its
boundedness, a positive constant $a$ such that $f(x) + a$ is positive. This latter function is
clearly integrable by property I, so that, by what has been proved,
$[f(x) + a]^2 = f^2(x) + 2a f(x) + a^2$ is also integrable, whence it follows at once that $f^2(x)$
is integrable, this being expressible as a sum of integrable functions:
$f^2(x) = [f(x) + a]^2 - 2a f(x) - a^2$. Finally, to prove that $f_1(x) f_2(x)$ is integrable,
we only need to write it as

$$f_1(x) f_2(x) = \tfrac{1}{2}\,[f_1(x) + f_2(x)]^2 - \tfrac{1}{2}\, f_1^2(x) - \tfrac{1}{2}\, f_2^2(x).$$

The right-hand side is a sum of integrable functions.


20. The existence of the general Stieltjes integral. Some sufficient conditions 
for the existence of the general Stieltjes integral are given below. 

THEOREM. Any bounded function $f(x)$ is integrable with respect to the jump
function $g_d(x)$ in the sense of the general Stieltjes integral.

Suppose first that the set $c_1, c_2, \ldots, c_p$ of points of discontinuity of $g_d(x)$ is
finite, and let $\delta$ be a subdivision of the basic interval, defined as follows: the
ends of the interval, if they belong to the domain of integration, and the points
$c_1, c_2, \ldots, c_p$ are independent elements of $\delta$, whilst the remaining elements are
the open intervals which are obtained after extracting the above-mentioned
points. In each of these intervals $g_d(x)$ retains a constant value, and the
sums corresponding to the integral of $f(x)$ with respect to $g_d(x)$ are obviously
the same and are given by

$$s_\delta = S_\delta = \sum_{k=1}^{p} f(c_k)\, \gamma_k, \tag{111}$$

where $\gamma_k$ is the jump of $g_d(x)$ at $c_k$.
The integral of $f(x)$ with respect to $g_d(x)$ therefore exists and is given by (111).
Now let the set of points of discontinuity $c_k$ of $g_d(x)$ be infinite. Let $\delta_n$ be a
subdivision of the basic interval carried out in the same way as above, the ends
of the interval and the first $n$ points of discontinuity $c_1, c_2, \ldots, c_n$ being taken
as independent elements of the subdivision. We form the sums $\sigma_{\delta_n}$ for this
sequence of $\delta_n$. The sum of the terms of $\sigma_{\delta_n}$ that come from the independent
elements of the subdivision is equal to

$$\sum_{k=1}^{n} f(c_k)\, \gamma_k$$

and does not depend on the variable points $\xi_k$. Let us consider one of the open
intervals $(\alpha, \beta)$ appearing in the subdivision $\delta_n$. The term of the sum $\sigma_{\delta_n}$
corresponding to it will have the form

$$f(\xi)\,[g_d(\beta - 0) - g_d(\alpha + 0)] \qquad (\alpha < \xi < \beta).$$

If $|f(x)| \le L$, we have

$$|f(\xi)|\,[g_d(\beta - 0) - g_d(\alpha + 0)] \le L\,[g_d(\beta - 0) - g_d(\alpha + 0)],$$

the difference $g_d(\beta - 0) - g_d(\alpha + 0)$ being the sum of the jumps of $g_d(x)$ inside
the interval $(\alpha, \beta)$. The sum of the terms in $\sigma_{\delta_n}$ corresponding to open intervals
of the subdivision $\delta_n$ will therefore have an absolute value not greater than the
product of $L$ with the sum of all the jumps of $g_d(x)$, except the jumps at the
points $c_1, c_2, \ldots, c_n$. This sum tends to zero on indefinite increase of $n$, whilst
the sum corresponding to the $c_k$ gives in the limit the sum of the convergent
series

$$\sum_{k=1}^{\infty} f(c_k)\, \gamma_k, \tag{112}$$

i.e. the integral of $f(x)$ with respect to $g_d(x)$ exists and is given by series (112),
which is what we had to prove.
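
The tail estimate used in this proof can be illustrated numerically. The sketch below rests on hypothetical data that do not appear in the text: a jump function on $[0, 1]$ with jumps of size $2^{-k}$ at the points $c_k = 1/k$, and $f(x) = x^2$, so that $L = 1$. The function names are likewise only illustrative.

```python
def integral_wrt_jump_function(f, jumps):
    """Sum of f(c_k) * gamma_k over the given (c_k, gamma_k) pairs."""
    return sum(f(c) * gamma for c, gamma in jumps)

def partial_sum(f, n_terms):
    """Partial sum of the series for the hypothetical jump function
    with jump 2**-k at c_k = 1/k, k = 1, ..., n_terms."""
    jumps = ((1.0 / k, 2.0 ** -k) for k in range(1, n_terms + 1))
    return integral_wrt_jump_function(f, jumps)

f = lambda x: x * x          # bounded by L = 1 on [0, 1]

s20 = partial_sum(f, 20)
s40 = partial_sum(f, 40)

# The terms omitted after the first 20 jumps are bounded by
# L times the sum of the remaining jumps, which is less than 2**-20.
tail_bound = 1.0 * 2.0 ** -20
print(s20, s40, tail_bound)
```

The difference between the two partial sums is smaller than the product of $L$ with the sum of the omitted jumps, which is the bound that forces the contribution of the open intervals to zero.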

Thus the question of the existence of the general Stieltjes integral reduces
to the question of the existence of the integral of $f(x)$ with respect to the
continuous non-decreasing function $g_c(x)$. For this latter function, any sequence
with indefinitely diminishing sub-intervals will be a regular sequence. If $f(x)$
is monotonic in the interval $\Delta$, by the formula for integration by parts [2],
it is integrable over $\Delta$ with respect to $g_c(x)$. Therefore, any monotonic function
is integrable with respect to any increasing bounded function.

Let us indicate a further class of functions integrable with respect to any
increasing bounded function $g(x)$. We shall say that $f(x)$ is piecewise constant
in $\Delta$ if this interval can be divided into a finite number of intervals $\Delta^{(1)}, \Delta^{(2)},
\ldots, \Delta^{(m)}$, pairs of which have no common points, such that $f(x)$ is constant
in each $\Delta^{(k)}$ $(k = 1, 2, \ldots, m)$. We denote the constants by the letter $b_k$. It is
easily seen that a piecewise constant function is integrable with respect to any
increasing function $g(x)$. For, if $\delta$ is a subdivision of $\Delta$ into $\Delta^{(k)}$ $(k = 1, 2, \ldots, m)$,
the sums $s_\delta$ and $S_\delta$ are the same:

$$s_\delta = S_\delta = \sum_{k=1}^{m} b_k\, G\big(\Delta^{(k)}\big). \tag{113}$$

On repeating this subdivision $\delta$, we can see that the integral of the piecewise
constant function $f(x)$ with respect to $g(x)$ exists and is given by the sum
(113). On applying property V of [19], we see that, if the function $f(x)$ is the
limit of a uniformly convergent sequence of piecewise constant functions in
$\Delta$, $f(x)$ is integrable with respect to any increasing bounded function $g(x)$.
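
Formula (113) and the passage to a uniform limit can be tried out on a simple case. This is a minimal sketch under assumptions of our own: $g(x) = x^2$ is continuous and increasing on the intervals used, so that $G$ of a piece is just the increment of $g$ over it, and $f(x) = x$ is approximated on $[0, 1]$ by step functions taking the value of $f$ at the left end of each piece. All names are hypothetical.

```python
def integrate_step_function(pieces, g):
    """Sum of b_k * G(piece_k), as in (113), for pieces given as (left, right, b_k).

    Assumes g is continuous, so G(piece) = g(right) - g(left)
    whichever end points the piece contains.
    """
    return sum(b * (g(right) - g(left)) for left, right, b in pieces)

def step_approximation(f, a, b, p):
    """Split [a, b] into p equal pieces and take f at each left end."""
    width = (b - a) / p
    return [(a + k * width, a + (k + 1) * width, f(a + k * width))
            for k in range(p)]

g = lambda x: x * x

# A fixed piecewise constant function: 2 on the first piece, -1 on the second.
value = integrate_step_function([(0.0, 1.0, 2.0), (1.0, 3.0, -1.0)], g)

# Step functions f_p converging uniformly to f(x) = x on [0, 1].
p = 1000
approx = integrate_step_function(step_approximation(lambda x: x, 0.0, 1.0, p), g)
print(value, approx)
```

Since $|f(x) - f_p(x)| \le 1/p$ here, inequality (106) bounds the error of `approx` by $G(\Delta)/p = 1/p$; the limiting value is the integral of $x$ with respect to $x^2$ over $[0, 1]$, namely $2/3$.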

It is worth remarking further that the necessary and sufficient condition for
integrability of $f(x)$ with respect to $g_c(x)$ is the same as was indicated in [10].
We shall not dwell on the proof.

Note 1. In [8] above, we defined a function of bounded variation $g(x)$
in a closed interval $[a, b]$. This concept can be introduced similarly in intervals
of a different type. Let us take say the open interval $(a, b)$. We say that $g(x)$
is a function of bounded variation in $(a, b)$ if it is of bounded variation in any
closed interval $[c, x]$ lying inside $(a, b)$, and $V_c^x(g)$ is not greater than a certain
number for any such interval. As $c$ diminishes, the value of $V_c^x(g)$ does not
diminish, so that it has a finite limit as $c$ tends to $a$, which we denote by $V_{a+0}^x(g)$.
As $x$ tends to $b$, it again has a finite limit, which we call the total variation of
$g(x)$ in $(a, b)$. Formulae (44) define the functions $g_1(x)$ and $g_2(x)$, which appear
in the canonical form of $g(x)$ as the difference between two non-decreasing bounded
functions: $g(x) = g_1(x) - g_2(x)$. The general Stieltjes integral is then defined as
the difference between the integrals with respect to the non-decreasing functions
$g_1(x)$ and $g_2(x)$:

$$\int_\Delta f(x)\, dg(x) = \int_\Delta f(x)\, dg_1(x) - \int_\Delta f(x)\, dg_2(x),$$

and the existence of the integral of $f(x)$ with respect to $g(x)$ follows from the
existence of the integrals of $f(x)$ with respect to $g_1(x)$ and $g_2(x)$.

On taking into account what has been said above, we can assert that, in
the sense of the general Stieltjes integral, a function of bounded variation and
a function which is the limit of a uniformly convergent sequence of piecewise
constant functions are integrable with respect to any function of bounded variation.

Note 2. A further concept of integral is sometimes introduced when
integrating over a semi-open interval $(a, b]$; this differs from the general Stieltjes
integral only in dividing the basic interval $(a, b]$ into semi-open intervals of the
type $(c, d]$ with no common points, instead of dividing it into intervals of any
type. Since the sets of numbers $s_\delta$ and $S_\delta$ can now only be part of the sets of
the same numbers for the general integral, the number $i$ can diminish, whilst
the number $I$ is increased. This implies that, if we had $i = I$ for the general
integral, we can have $i < I$ for the new form of integral, i.e. a function integrable
in the sense of the general integral may be non-integrable with the new
definition of the integral.


21. Functions of a two-dimensional interval. The concept of additive
function of an interval and the constructions of the Stieltjes integral
become more complicated on a plane, in three-dimensional space, and
in general in n-dimensional space. Let us take the case of a plane.
The arguments that follow readily suggest the modifications needed
for spaces with a larger number of dimensions. Given a plane with X
and Y axes, suppose we have an interval $\Delta_x$ on the X axis and an
interval $\Delta_y$ on the Y axis, the intervals being understood in the general
sense as discussed in [20]. These intervals $\Delta_x$ and $\Delta_y$ define an interval
$\Delta$ on the plane, i.e. a point $(x, y)$ is reckoned to belong to $\Delta$ if $x$ belongs
to $\Delta_x$ and $y$ to $\Delta_y$. There can be many varied types of plane interval
(domain). For instance, an interval may be closed, and given by the
inequalities $a \le x \le b$, $c \le y \le d$; or it may be semi-open, given by
$a < x \le b$, $c < y \le d$; or semi-open, given by $a \le x < b$, $c \le y < d$;
or semi-open, given by $a < x \le b$, $c \le y < d$; or it may be a segment
of straight line parallel to the X axis: $a \le x \le b$, $y = c$, or
the point $x = a$, $y = c$, and so on. The numbers $a, b, c, d$ appearing
in the above inequalities may be finite or infinite. Semi-open intervals
given by inequalities of the form $a < x \le b$, $c < y \le d$ will be of
great importance to us later, and for the sake of brevity only intervals
of this type will be described in future as semi-open.

Suppose that a non-negative function $G(\Delta)$, having additive and
normal properties, is defined for an interval $\Delta$ belonging to the
basic interval $\Delta_0$. In other words, if $\Delta$ is the sum of intervals
$\Delta^{(1)} + \Delta^{(2)} + \ldots + \Delta^{(m)}$, with no common points, then

$$G(\Delta) = \sum_{k=1}^{m} G\big(\Delta^{(k)}\big),$$

and if $\Delta_1, \Delta_2, \ldots$ is a vanishing sequence of intervals, $G(\Delta_n) \to 0$.

We shall describe some properties of such functions. The fact that
$G(\Delta)$ is non-negative and additive implies at once that $G(\Delta_0)$ is the
greatest value of $G(\Delta)$ for $\Delta$ belonging to $\Delta_0$. We shall simply write
$G(P)$ for the value of $G(\Delta)$ when $\Delta$ is an individual point $P$; we have
$G(P) \ge 0$ since $G(\Delta)$ is non-negative. It may happen that $G(P) = 0$ at
every point $P$ belonging to $\Delta_0$. In this case $G(\Delta)$ is said to be continuous
in $\Delta_0$. If $G(P) > 0$, $P$ is called a point of discontinuity of $G(\Delta)$. It may
easily be shown that the set of points of discontinuity is finite or
denumerable [cf. 6]. Let us consider the points of discontinuity at which
$G(P) \ge 1$. Since $G(\Delta)$ is non-negative and additive, the number of
such points is not greater than the integral part of the number
$G(\Delta_0)$. Similarly, the number of points $P$ at which $G(P) \ge 1/2$ is not
greater than the integral part of $2G(\Delta_0)$, and so on. Hence we can
conclude, precisely as in [6], that the set of points of discontinuity is
finite or denumerable. If it is denumerable, and $P_1, P_2, \ldots$ is a
sequence of points of discontinuity, the series formed from the positive
numbers $G(P_k)$ is convergent and the inequality holds:

$$\sum_{k=1}^{\infty} G(P_k) \le G(\Delta_0). \tag{114}$$

Similarly, if $l$ is the piece of straight line parallel to one of the axes
that appears in the constitution of $\Delta_0$, it is called a line of discontinuity
if $G(l) > 0$. Every straight line parallel to an axis and passing through
a point of discontinuity gives a line of discontinuity. But there can
also be lines of discontinuity that contain no point of discontinuity.
It can be shown precisely as above that, if lines of discontinuity
exist, the set of them is either finite or denumerable. Here, we take
the complete segment of straight line parallel to an axis that appears
in the composition of $\Delta_0$, without cutting it into pieces. Suppose we
have a sequence of intervals $\Delta_n$ $(n = 1, 2, \ldots)$ such that $\Delta_n$ contains
$\Delta_{n+1}$, and the point $P$ or all the points of the line $l$ are the only points
common to all the $\Delta_n$. We say in this case that $\Delta_n$ is a system of embedded
intervals tending to $P$ or $l$ ($l$ is a straight segment parallel to one
of the axes). By making use of the normality of $G(\Delta)$, it can be shown
that, if $\Delta_n$ is a system of embedded intervals tending to $P$ or $l$, then
$G(\Delta_n) \to G(P)$ or $G(l)$. We give the proof for the case of a point $P$, on
the assumption that $P$ lies inside all the $\Delta_n$, which are open. We draw
through $P$ straight lines parallel to the axes, and split each of the $\Delta_n$
into the following pieces: the point $P$, the four segments of the straight
lines drawn that lie in $\Delta_n$, and the remaining four intervals. On indefinite
compression of the $\Delta_n$ to the point $P$, all the constructed elements of
subdivision, apart from $P$, will form a vanishing sequence of intervals,
and $G(\Delta)$ will tend to zero for each of them, by virtue of the normality
of $G(\Delta)$. On recalling that $G(\Delta)$ is additive, it is now easily seen that
$G(\Delta_n) \to G(P)$. The other cases for a point $P$, and the cases for a line $l$†,
can be similarly treated.

The definitions given above become perfectly clear if $G(\Delta)$ is
interpreted as the mass located on the interval $\Delta$ when matter is
distributed over the basic interval $\Delta_0$. If say $G(P) > 0$, we have a
concentrated mass $G(P)$ at the point $P$. Similarly, if $G(l) > 0$, the mass $G(l)$ is
distributed in some manner along the line $l$. For instance, suppose we
have a semi-open interval $\Delta_0$ $(0 < x \le 2;\ 0 < y \le 2)$, and that mass
is distributed in it with unit linear density along the line $l_0$ $(x = 1,\
0 < y \le 1)$. In this case $G(\Delta)$ is equal to the length of the part of $l_0$
which is contained in $\Delta$. At any point $P$ of $\Delta_0$, the function $G(P) = 0$.
If $\Delta_n$ is a system of embedded intervals tending to $l_0$, their areas tend
to zero, whilst $G(\Delta_n) = 1$ for all $n$.


† Later on, we shall prove this property of $G(\Delta)$ not only for intervals,
but for much more general point sets.

A function $G(\Delta)$, defined for all intervals belonging to $\Delta_0$, may
readily be extended in general to all intervals of the plane, the function
meantime remaining non-negative, additive and normal. For, if $\Delta$ is
any interval, and $\Delta \Delta_0$ is the product of intervals $\Delta$ and $\Delta_0$, i.e. the set
of points belonging simultaneously to $\Delta$ and $\Delta_0$, this product is an
interval belonging to $\Delta_0$, and we obtain our extension of $G(\Delta)$ if we
put $G(\Delta) = G(\Delta \Delta_0)$. This extension of $G(\Delta)$ may easily be interpreted
physically if we regard $G(\Delta)$ as the mass contained on the interval $\Delta$
when the total mass is distributed over the interval $\Delta_0$.

We shall see later that, if a non-negative, additive and normal 
function is given only on semi-open intervals, it can be uniquely 
extended not only to whole intervals, but also to a much wider class 
of point sets on the plane whilst retaining the properties mentioned. 
Its values at points and on lines $l$ can be obtained, as indicated above,
by a passage to the limit, with the aid of a system of embedded inter- 
vals, the limit being independent of the precise system of embedded 
intervals taken. 

A function $G(\Delta)$ can easily be split into a jump function and a
continuous part. Let $G(\Delta)$ be non-negative, additive, normal and defined
for all $\Delta$ belonging to $\Delta_0$. We write $P_k$ $(k = 1, 2, \ldots)$ for its points of
discontinuity. We define the jump function $G_d(\Delta)$ as follows: $G_d(\Delta)$ is
equal to the sum of the values of $G(P_k)$ at the $P_k$ which belong to $\Delta$.
It may be remarked that, if there is a denumerable set of such points,
the series consisting of the $G(P_k)$ is convergent. The difference
$G(\Delta) - G_d(\Delta)$ will be denoted by $G_c(\Delta)$. This latter function has no
discontinuities. Both the functions $G_d(\Delta)$ and $G_c(\Delta)$ are easily seen to be
non-negative, additive and normal. The normality follows at once
from the inequalities: $0 \le G_d(\Delta) \le G(\Delta)$ and $0 \le G_c(\Delta) \le G(\Delta)$. We
write our expansion of $G(\Delta)$ as

$$G(\Delta) = G_d(\Delta) + G_c(\Delta).$$


22. Passage to point functions. We can form an interval function
$G(\Delta)$ by using a point function $g(x, y)$. Suppose we have a point
function $g(x, y)$, which is defined if $x$ belongs to the interval $\Delta^{(1)}$
of the X axis, and $y$ to the interval $\Delta^{(2)}$ of the Y axis, where $\Delta^{(1)}$
and $\Delta^{(2)}$ define a basic interval $\Delta_0$ on the plane. Suppose further that
$g(x, y)$ is non-decreasing with respect to each variable for any fixed
value of the other, and that

$$g(x + h, y + k) - g(x + h, y) - g(x, y + k) + g(x, y) \ge 0 \tag{115}$$

for any $x$ and $y$ and any positive $h$ and $k$. It follows from what has been
said that the limits $g(x, y \pm 0)$ and $g(x \pm 0, y)$ exist, where

(1) $g(x_2, y + 0) \ge g(x_1, y + 0)$; $g(x_2, y - 0) \ge g(x_1, y - 0)$
for $x_2 > x_1$, and

(2) $g(x + 0, y_2) \ge g(x + 0, y_1)$; $g(x - 0, y_2) \ge g(x - 0, y_1)$
for $y_2 > y_1$.


In view of the monotonic property expressed by these formulae,
we can pass to the limit successively with respect to the variables.
We form the limits

$$A = \lim_{k \to +0} \lim_{h \to +0} g(x + h, y + k) \quad \text{and} \quad B = \lim_{h \to +0} \lim_{k \to +0} g(x + h, y + k)$$

and show that they are equal. We have $g(x + h, y + k) \ge g(x + h, y + 0)$
and, on passing to the limit first with respect to $h$, then with
respect to $k$, we get $A \ge B$. Similarly, it can be shown that $B \ge A$,
so that $A = B$. We naturally write the symbol $g(x + 0, y + 0)$ for the
quantity $A = B$. It can similarly be shown that

$$\lim_{k \to +0} \lim_{h \to +0} g(x - h, y - k) = \lim_{h \to +0} \lim_{k \to +0} g(x - h, y - k),$$

and this limit is naturally written symbolically as $g(x - 0, y - 0)$.
So far, we have only used the fact that $g(x, y)$ is non-decreasing with
respect to each variable. We now make use of condition (115). We form
the limits

$$A_1 = \lim_{k \to +0} \lim_{h \to +0} g(x + h, y - k) \quad \text{and} \quad B_1 = \lim_{h \to +0} \lim_{k \to +0} g(x + h, y - k)$$

and show that $A_1 = B_1$. Putting $0 < h_1 < h$ and $0 < k_1 < k$, we can
write, in view of (115):

$$g(x + h, y - k_1) - g(x + h, y - k) - g(x + h_1, y - k_1) + g(x + h_1, y - k) \ge 0.$$

On letting first $h_1$, then $k_1$ tend to zero, we get

$$g(x + h, y - 0) - g(x + h, y - k) - A_1 + g(x + 0, y - k) \ge 0.$$

If we now let first $h$, then $k$ tend to zero, we get $B_1 - A_1 \ge 0$, i.e.
$B_1 \ge A_1$. It can similarly be shown that $A_1 \ge B_1$, so that $A_1 = B_1$.
We naturally write $g(x + 0, y - 0)$ for the quantity $A_1 = B_1$.
Similarly,

$$\lim_{k \to +0} \lim_{h \to +0} g(x - h, y + k) = \lim_{h \to +0} \lim_{k \to +0} g(x - h, y + k),$$

and we write $g(x - 0, y + 0)$ for this repeated limit. It may easily be
shown that, in all the four cases considered, the same limit is obtained
for any simultaneous convergence of $h$ and $k$ to $(+0)$. For instance,
we can assert the following: given any positive $\varepsilon$, there exists a positive
$\eta$ such that

$$|g(x + h, y - k) - g(x + 0, y - 0)| < \varepsilon$$

for $0 < h < \eta$ and $0 < k < \eta$.
The symbols $g(x \pm 0, y \pm 0)$ thus have a definite meaning when
condition (115) is satisfied. A non-negative, additive and normal
function $G(\Delta)$ is formed with the aid of $g(x, y)$ as follows. Let $\Delta_x$ and
$\Delta_y$ be intervals on the axes defining the interval $\Delta$ on the plane, and
let $a$ and $b$ be the boundary points of $\Delta_x$, and $c$, $d$ the boundary points
of $\Delta_y$. Now $G(\Delta)$ is equal to the following expression:

$$G(\Delta) = g(b \pm 0, d \pm 0) - g(b \pm 0, c \pm 0) - g(a \pm 0, d \pm 0) + g(a \pm 0, c \pm 0),$$

where the $+$ sign is taken when the interval is closed at the right-hand
end or open at the left-hand end, and the $-$ sign when the interval is
open at the right-hand end or closed at the left-hand end. For instance,
we have in the case of the closed interval $a \le x \le b$, $c \le y \le d$,

$$G(\Delta) = g(b + 0, d + 0) - g(b + 0, c - 0) - g(a - 0, d + 0) + g(a - 0, c - 0).$$

In the case of the semi-open interval $a < x \le b$, $c < y \le d$:

$$G(\Delta) = g(b + 0, d + 0) - g(b + 0, c + 0) - g(a + 0, d + 0) + g(a + 0, c + 0),$$

and in the case of the point $x = a$, $y = c$:

$$G(\Delta) = g(a + 0, c + 0) - g(a + 0, c - 0) - g(a - 0, c + 0) + g(a - 0, c - 0).$$
Conversely, if $G(\Delta)$ exists, we can easily form the point function
$g(x, y)$, with the aid of which $G(\Delta)$ can be obtained by the method
indicated. Suppose say that $G(\Delta)$ is given throughout the open plane.
Now, $g(x, y)$ can be taken equal to the value of $G(\Delta)$ for the interval
$\Delta_{x,y}$, which is defined by the following intervals on the axes:
$-\infty < x' \le x$; $-\infty < y' \le y$. The $g(x, y)$ thus constructed is continuous
with respect to $x$ and $y$ from the right. It may be observed that the
left-hand side of (115) is unchanged if an arbitrary polynomial of the
first degree in $x$ and $y$ is added to $g(x, y)$. If $g(x, y)$ is a continuous
function, the $G(\Delta)$ corresponding to it has no points or lines of
discontinuity, and vice versa. If $g(x, y)$ has a discontinuity at the point
$(x, y)$, but

$$g(x + 0, y + 0) - g(x - 0, y + 0) - g(x + 0, y - 0) + g(x - 0, y - 0) = 0,$$

this point is not a discontinuity of $G(\Delta)$.

Let us take as an example the above-mentioned mass distribution,
in which mass is distributed with unit linear density on the segment
$x = 1$, $0 < y \le 1$. Here $g(x, y) = 0$ if $x < 1$ or $y < 0$; $g(x, y) = y$ if
$x \ge 1$ and $0 \le y \le 1$; $g(x, y) = 1$ if $x \ge 1$ and $y > 1$.
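
The four-term expression for a semi-open interval can be checked directly on this example. The sketch below is only an illustration: it assumes the point function of the example, which is continuous from the right in both variables, so that $g(x + 0, y + 0) = g(x, y)$ and no limits need to be computed. The name `G_semi_open` is ours.

```python
def g(x, y):
    """Point function of the example: unit linear density on x = 1, 0 < y <= 1."""
    if x < 1 or y < 0:
        return 0.0
    return min(y, 1.0)

def G_semi_open(a, b, c, d):
    """Mass of the semi-open interval a < x <= b, c < y <= d.

    Uses g(b+0, d+0) - g(b+0, c+0) - g(a+0, d+0) + g(a+0, c+0);
    since this g is continuous from the right, the +0 limits are just values.
    """
    return g(b, d) - g(b, c) - g(a, d) + g(a, c)

print(G_semi_open(0.0, 2.0, 0.0, 2.0))     # whole basic interval
print(G_semi_open(0.5, 1.5, 0.25, 0.75))   # contains a piece of the line of length 0.5
print(G_semi_open(1.0, 2.0, 0.0, 1.0))     # the line x = 1 is excluded
```

In each case the value agrees with the length of the part of the segment contained in the interval; in the third case the interval $1 < x \le 2$ does not contain the line $x = 1$, and the mass is zero.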

A similar treatment can be given of intervals in three-dimensional
space with axes X, Y and Z. Such an interval is defined by three intervals
on the axes. In addition to points, and lines parallel to the axes, we
also have to consider planes parallel to the coordinate planes. Otherwise,
all the arguments and results are the same as above.


23. The Stieltjes integral on a plane. The concept of Stieltjes integral
is readily generalized to the case of a plane. Obviously, we are
concerned here with the subdivision of a two-dimensional interval. If $\Delta$ is
an interval of the plane, defined by intervals $\Delta_x$ and $\Delta_y$ on the axes,
a subdivision of $\Delta$ is a division obtained by dividing $\Delta_x$ and $\Delta_y$ into
sub-intervals. Every sub-interval of $\Delta$ is defined by a sub-interval of
$\Delta_x$ and a sub-interval of $\Delta_y$. Figure 1 illustrates a subdivision of a
semi-open interval, carried out as indicated, into six sub-intervals.

[Fig. 1 and Fig. 2]

A subdivision of quite a different type is illustrated in Fig. 2; the
dividing lines here end in the middle of $\Delta$, and it cannot be obtained
by the above method of subdividing $\Delta_x$ and $\Delta_y$. But we can easily pass
from a subdivision of the second type to a subdivision of the first type
by continuing all the dividing lines, and we shall in future only use
subdivisions of the first type; this is a matter of minor importance.
A subdivision $\delta'$ is described as an extension of a subdivision $\delta$ if the
subdivisions of $\Delta_x$ and $\Delta_y$ corresponding to $\delta'$ are extensions of the
subdivisions of $\Delta_x$ and $\Delta_y$ corresponding to $\delta$. If $\delta_1$ and $\delta_2$ are two
subdivisions of $\Delta$, and $\delta_1^{(1)}$, $\delta_1^{(2)}$ and $\delta_2^{(1)}$, $\delta_2^{(2)}$ are the corresponding
subdivisions of $\Delta_x$ and $\Delta_y$, the product $\delta_1 \delta_2$ is the subdivision given by the
subdivisions $\delta_1^{(1)} \delta_2^{(1)}$ and $\delta_1^{(2)} \delta_2^{(2)}$ of $\Delta_x$ and $\Delta_y$. The subdivision $\delta_1 \delta_2$
is evidently an extension of $\delta_1$ and $\delta_2$. A further remark: if $\Delta'$ is an
interval belonging to $\Delta$, there exist subdivisions of $\Delta$ in which $\Delta'$ is
one of the sub-intervals.

It is easy to form an analogue of the Stieltjes integral in the 
case of a plane, three-dimensional space, and in general n-dimensional 
space. We shall confine ourselves to the plane case. The constructions 
are essentially the same in the other cases. 

Let $\Delta_0$ be a finite interval on the plane, on which are given a point
function $f(P)$ which is uniformly continuous and therefore bounded
and a non-negative, additive and normal interval function $G(\Delta)$.
Let $\delta$ be a subdivision of $\Delta_0$ into intervals $\Delta_1, \Delta_2, \ldots, \Delta_n$, pairs of
which have no common points. We choose a point $P_k$ in each $\Delta_k$ and
form the sum

$$\sigma_\delta = \sum_{k=1}^{n} f(P_k)\, G(\Delta_k). \tag{116}$$

As in [4], it can be shown that this sum has a definite limit when
the greatest of the diameters of the domains $\Delta_k$ tends to zero. This
limit is in fact the integral of $f(P)$ with respect to $G(\Delta)$:

$$\int_{\Delta_0} f(P)\, G(d\Delta) = \lim \sum_{k=1}^{n} f(P_k)\, G(\Delta_k). \tag{117}$$


If $f(x, y)$ is continuous in the closed interval $\Delta_0$ $(a_1 \le x \le b_1;\
a_2 \le y \le b_2)$ and $g(x, y)$ has the above-mentioned properties, the
Stieltjes integral can be defined as the limit of the sum

$$\sigma_\delta = \sum_{k=1}^{p} \sum_{l=1}^{q} f(\xi_k, \eta_l)\,[g(x_k, y_l) - g(x_{k-1}, y_l) - g(x_k, y_{l-1}) + g(x_{k-1}, y_{l-1})], \tag{118}$$

where

$$a_1 = x_0 < x_1 < \ldots < x_{p-1} < x_p = b_1; \quad a_2 = y_0 < y_1 < \ldots < y_{q-1} < y_q = b_2;$$
$$x_{k-1} \le \xi_k \le x_k; \quad y_{l-1} \le \eta_l \le y_l \tag{119}$$

on indefinite refinement of the sub-intervals.
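
Sum (118) translates directly into a double loop. The following sketch makes several assumptions of our own: a uniform grid, the points $(\xi_k, \eta_l)$ taken at the centres of the cells, and the test case $g(x, y) = xy$, for which the bracket in (118) is simply the area of a cell, so that the Stieltjes integral reduces to the ordinary double integral. The names are hypothetical.

```python
def double_stieltjes_sum(f, g, xs, ys):
    """Sum (118) over the grid with nodes xs[0] < ... < xs[p], ys[0] < ... < ys[q]."""
    total = 0.0
    for k in range(1, len(xs)):
        for l in range(1, len(ys)):
            xi = 0.5 * (xs[k - 1] + xs[k])      # x_{k-1} <= xi_k <= x_k
            eta = 0.5 * (ys[l - 1] + ys[l])     # y_{l-1} <= eta_l <= y_l
            increment = (g(xs[k], ys[l]) - g(xs[k - 1], ys[l])
                         - g(xs[k], ys[l - 1]) + g(xs[k - 1], ys[l - 1]))
            total += f(xi, eta) * increment
    return total

def uniform_grid(lo, hi, n):
    """n + 1 equally spaced nodes from lo to hi."""
    return [lo + (hi - lo) * i / n for i in range(n + 1)]

f = lambda x, y: x + y
g = lambda x, y: x * y      # mixed difference of g over a cell = area of the cell

value = double_stieltjes_sum(f, g, uniform_grid(0.0, 1.0, 50), uniform_grid(0.0, 2.0, 50))
print(value)
```

For this choice the limit is the double integral of $x + y$ over $0 \le x \le 1$, $0 \le y \le 2$, which equals 3; with a linear $f$ and cell centres as the chosen points the sum agrees with it up to rounding.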

As in [4], it can be shown that the Stieltjes integral exists, on the
assumption that $f(P)$ is continuous inside $\Delta_0$ and bounded, if $G(\Delta)$
satisfies a subsidiary condition which we state next.

Let $\Delta^{(n)}$ be closed intervals lying inside $\Delta_0$, which expand and tend
to $\Delta_0$ in such a way that any interior point of $\Delta_0$ comes inside $\Delta^{(n)}$ for
sufficiently large $n$. We require that $G(\Delta^{(n)}) \to G(\Delta_0)$. This is analogous
to the continuity of $g(x)$ at the ends of the interval, which we spoke
about in [4].

The Stieltjes integral can be defined for the whole of the plane:
$-\infty < x < +\infty$, $-\infty < y < +\infty$, which we write as $Q$. Let $G(\Delta)$ be
a non-negative, additive and normal function, defined for all intervals
both finite and infinite belonging to $Q$. Let $\Delta^{(n)}$ be a sequence of
intervals indefinitely expanding in all directions, e.g. let $\Delta^{(n)}$ be
$-n \le x \le n$; $-n \le y \le n$. Since $G(\Delta)$ is normal, $G(\Delta^{(n)}) - G(Q) \to 0$ as
$n \to \infty$. If $f(P)$ is continuous and bounded in $Q$, the Stieltjes integral
(117) exists. Here, the sequence of subdivisions must be such that,
given any fixed $n$, the greatest diameter of an interval having points
in common with $\Delta^{(n)}$ tends to zero.

The domain of integration may not be the interval $\Delta_0$ but may be
a domain $S$ which represents the sum of a finite number of intervals.
We can perform as many subdivisions as desired, form sums (116),
and pass to the limit. The integral over $S$ reduces to the finite sum of
integrals over the intervals into which $S$ can be split, and obviously
does not depend on the method of subdividing $S$.

The properties of the double Stieltjes integral are precisely analogous 
to those given above for the simple integral. 


24. Functions of bounded variation on the plane. The treatment of
functions of bounded variation on a plane is in many respects similar
to the above. The statement will be somewhat different since our
discussion will be in terms of interval functions instead of point
functions. Let $G(\Delta)$ be additive and normal and defined for all intervals
(in the usual sense of this word), belonging to some basic interval
$\Delta_0$. We shall not assume that this $G(\Delta)$ is non-negative. Let $\Delta_1, \ldots, \Delta_p$
be a subdivision $\delta$ of the interval $\Delta_0$ into sub-intervals. We form the
sums

$$t_\delta = \sum_{k=1}^{p} |G(\Delta_k)|. \tag{120}$$

DEFINITION. If, given all possible subdivisions $\delta$, the set of values of
$t_\delta$ is bounded, $G(\Delta)$ is said to be a function of bounded variation in the
interval $\Delta_0$, whilst the strict upper bound of the sums $t_\delta$ is called the
total variation or simply the variation of $G(\Delta)$ in the interval $\Delta_0$. We
shall denote it by the symbol $V_{\Delta_0}(G)$. The properties of sums $t_\delta$ and
of the total variation are precisely similar to the properties discussed
in [8], and we shall state most of these properties without proof.

If $\delta'$ is an extension of the subdivision $\delta$, then
$t_{\delta'} \ge t_\delta$. If $G(\Delta)$ is of bounded variation in
$\Delta_0$, it is of bounded variation in any interval $\Delta'$ belonging
to $\Delta_0$, where $V_{\Delta'}(G) \le V_{\Delta_0}(G)$. Any non-negative
or non-positive function $G(\Delta)$ is of bounded variation. If the
interval $\Delta'$ belongs to $\Delta_0$, we have

$$ |G(\Delta')| \le V_{\Delta_0}(G), \tag{121} $$


and $G(\Delta)$, of bounded variation in $\Delta_0$, will be bounded (in
absolute value) for all $\Delta$ belonging to $\Delta_0$. Every linear
combination of functions of bounded variation is of bounded variation.
Theorem 3 of [8] holds for a product and quotient. The total variation
$V_\Delta(G)$ is a non-negative function defined in $\Delta_0$. We can show
by repeating the proof of theorem 4 of [8] that $V(\Delta) = V_\Delta(G)$ is
additive. Let us show that $V(\Delta)$ is a normal function in $\Delta_0$.
Let $\Delta_m$ ($m = 1, 2, 3, \ldots$) be a vanishing sequence of intervals.
We have to show that $V(\Delta_m) \to 0$. Let $\varepsilon$ be a given
positive number. We take a subdivision $\delta$:
$\Delta^{(1)}, \Delta^{(2)}, \ldots, \Delta^{(p)}$ of $\Delta_0$ for which

$$ \sum_{k=1}^{p} |G(\Delta^{(k)})| \ge V(\Delta_0) - \varepsilon. \tag{122} $$

For any $k$ of the series of numbers $k = 1, 2, 3, \ldots, p$, the product
$\Delta_m \Delta^{(k)}$ ($m = 1, 2, \ldots$) is a vanishing sequence. Since
$G(\Delta)$ is normal, we can fix an $m = m_0$ such that

$$ |G(\Delta_m \Delta^{(k)})| \le \frac{\varepsilon}{p} \quad \text{for } m \ge m_0 \quad (k = 1, 2, \ldots, p). \tag{123} $$

We fix any $m \ge m_0$. Each interval $\Delta^{(k)}$ is subjected to a further
subdivision such that $\Delta_m \Delta^{(k)}$ is a sub-interval. Let
$\Delta_s^{(k)}$ ($s = 1, 2, \ldots, m_k$) be the remaining sub-intervals with
this subdivision of $\Delta^{(k)}$. A subdivision of $\Delta_0$ is thus
obtained which is an extension of subdivision $\delta$, so that inequality
(122) holds for it all the more, i.e.

$$ \sum_{k=1}^{p} |G(\Delta_m \Delta^{(k)})| + \sum_{k=1}^{p} \sum_{s=1}^{m_k} |G(\Delta_s^{(k)})| \ge V(\Delta_0) - \varepsilon. $$

Since $V(\Delta)$ is additive, and $|G(\Delta_s^{(k)})| \le V(\Delta_s^{(k)})$,
the last inequality gives us

$$ \sum_{k=1}^{p} |G(\Delta_m \Delta^{(k)})| + \sum_{k=1}^{p} \sum_{s=1}^{m_k} V(\Delta_s^{(k)}) \ge V(\Delta_m) + \sum_{k=1}^{p} \sum_{s=1}^{m_k} V(\Delta_s^{(k)}) - \varepsilon $$

or

$$ V(\Delta_m) \le \sum_{k=1}^{p} |G(\Delta_m \Delta^{(k)})| + \varepsilon \quad \text{for } m \ge m_0. $$

On taking (123) into account, we get

$$ V(\Delta_m) \le p \cdot \frac{\varepsilon}{p} + \varepsilon = 2\varepsilon \quad \text{for } m \ge m_0, $$

whence, in view of the arbitrariness of $\varepsilon$, it follows that
$V(\Delta_m) \to 0$.
Thus $V(\Delta)$ is a non-negative, additive and normal interval function
in $\Delta_0$. We define further non-negative, additive and normal
functions in accordance with the formulae

$$ G_1(\Delta) = \tfrac{1}{2}[V(\Delta) + G(\Delta)]; \quad G_2(\Delta) = \tfrac{1}{2}[V(\Delta) - G(\Delta)] \tag{124} $$

and thus obtain the canonical form of a function of bounded variation
$G(\Delta)$ as the difference between two non-negative, additive and normal
functions:

$$ G(\Delta) = G_1(\Delta) - G_2(\Delta). \tag{125} $$

Given some analogous form

$$ G(\Delta) = G_1^{*}(\Delta) - G_2^{*}(\Delta), \tag{126} $$

we have for any $\Delta$ belonging to $\Delta_0$:

$$ G_1(\Delta) \le G_1^{*}(\Delta) \quad \text{and} \quad G_2(\Delta) \le G_2^{*}(\Delta). $$

Conversely, if $G(\Delta)$ is the difference between two non-negative,
additive and normal functions, $G(\Delta)$ is a function of bounded variation.
If the interval $\Delta$ is the point $P$, $V(P)$ coincides with $|G(P)|$.
If $G(P) = 0$, then $V(P) = 0$, i.e. the continuity of $G(\Delta)$ at a point
implies the continuity of $V(\Delta)$ at that point. The situation is rather
more complicated for segments parallel to the axes. Let us interpret
$G(\Delta)$ as the amount of charge distributed in the interval $\Delta$. If
electrical charges whose total sum is zero are distributed, say with definite
linear density, on some segment $l$ of a straight line parallel to an axis,
then $G(l) = 0$. On the other hand, we have $V(l) > 0$, since $V(l)$ gives
the total sum of the charges when all the charges are taken with the plus sign.
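
The charge interpretation can be made concrete with a small sketch. It is an illustration only, under assumptions not made in the text: the charges are finitely many point charges (the dictionary `CHARGES` and all function names are hypothetical), intervals are closed rectangles `(x0, x1, y0, y1)`, and the variation is taken to be the sum of the absolute values of the charges, which is what subdivisions separating the charges would give.

```python
# Hypothetical model: G(rect) is the net charge of finitely many point
# charges lying in the closed rectangle rect = (x0, x1, y0, y1).
CHARGES = {
    (0.25, 0.5): 2.0,
    (0.5, 0.5): -1.5,
    (0.75, 0.5): -0.5,
    (0.5, 0.25): 1.0,
}

def inside(point, rect):
    """True if the point lies in the closed rectangle."""
    x, y = point
    x0, x1, y0, y1 = rect
    return x0 <= x <= x1 and y0 <= y <= y1

def G(rect):
    """Net charge in rect (a signed interval function)."""
    return sum(q for p, q in CHARGES.items() if inside(p, rect))

def V(rect):
    """Variation in rect: every charge taken with the plus sign."""
    return sum(abs(q) for p, q in CHARGES.items() if inside(p, rect))

def G1(rect):
    """First term of the canonical form (124): the positive charge."""
    return 0.5 * (V(rect) + G(rect))

def G2(rect):
    """Second term of the canonical form (124): the negative charge."""
    return 0.5 * (V(rect) - G(rect))

unit = (0.0, 1.0, 0.0, 1.0)       # the basic interval
segment = (0.0, 1.0, 0.5, 0.5)    # degenerate interval parallel to the X axis
```

On the segment the charges 2.0, -1.5 and -0.5 cancel, so `G(segment)` vanishes while `V(segment)` does not, which is the situation described above for segments parallel to the axes.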

An additive and normal function $G(\Delta)$ could be defined only on
semi-open intervals and use made only of subdivisions into semi-open
intervals. All the arguments given above could be repeated in this case,
and (125) obtained. Non-negative, additive and normal functions
$G_1(\Delta)$ and $G_2(\Delta)$ of a semi-open interval could be extended
to all possible intervals, and hence (125) would give us an extension
of the given $G(\Delta)$ to all possible intervals, and this function would
be additive, normal and of bounded variation for all possible intervals.

By using canonical form (125), we can define the integral with
respect to $G(\Delta)$ in the usual way:

$$ \int_{\Delta_0} f(P)\, G(d\Delta) = \int_{\Delta_0} f(P)\, G_1(d\Delta) - \int_{\Delta_0} f(P)\, G_2(d\Delta). \tag{127} $$

Given a closed interval $\Delta_0$ ($a_1 \le x \le b_1$; $a_2 \le y \le b_2$),
by using expression (115) we can define a function of bounded variation
$g(x, y)$ and a total variation as in [8], by starting from the sums

$$ t = \sum_{k=1}^{p} \sum_{l=1}^{q} \left| g(x_k, y_l) - g(x_{k-1}, y_l) - g(x_k, y_{l-1}) + g(x_{k-1}, y_{l-1}) \right|. \tag{128} $$


25. The space of continuous functions of several variables. Let $f(x, y)$ be a
continuous function given in a finite closed interval $\Delta_0$
($a_1 \le x \le b_1$; $a_2 \le y \le b_2$). With the aid of the linear
transformation $x' = ax + b$, $y' = cy + d$ ($a$ and $c \ne 0$), we can reduce
$\Delta_0$ to the interval ($0 \le x' \le 1$; $0 \le y' \le 1$), and a
continuous function remains continuous after transformation. We shall
therefore assume that the original interval $\Delta_0$ has already been
transformed to $0 \le x \le 1$, $0 \le y \le 1$. We form the Bernshtein
polynomials [II; 154] for $f(x, y)$:

$$ P_{m,n}(x, y) = \sum_{k=0}^{m} \sum_{l=0}^{n} f\!\left(\frac{k}{m}, \frac{l}{n}\right) C_m^k C_n^l\, x^k y^l (1 - x)^{m-k} (1 - y)^{n-l}. \tag{129} $$

As in [II; 154], we can show by using the uniform continuity of $f(x, y)$ in
$\Delta_0$ that $P_{m,n}(x, y) \to f(x, y)$ on indefinite increase of $m$ and
$n$, uniformly in $\Delta_0$. Now suppose that we have an $f(x, y)$ which is
continuous on the bounded closed set $F$ [IV; 157]. By using a linear
transformation as above, we can assume that $F$ belongs to the $\Delta_0$
indicated above. Further, we can extend $f(x, y)$ to the whole of $\Delta_0$
whilst preserving its continuity and the maximum of $|f(x, y)|$ [IV; 157]. We
can construct for the $f(x, y)$ thus extended a sequence of polynomials
$P_{m,n}(x, y)$ such that $P_{m,n}(x, y) \to f(x, y)$ uniformly in $\Delta_0$
and all the more uniformly in $F$.
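
As a numerical illustration of (129), the following sketch evaluates the two-variable Bernshtein polynomial directly from its definition. The function name `bernstein_2d` and the test function are hypothetical choices made for this example; only the formula itself comes from the text.

```python
from math import comb

def bernstein_2d(f, m, n, x, y):
    """Evaluate P_{m,n}(x, y) of formula (129) for f on the unit square."""
    total = 0.0
    for k in range(m + 1):
        # weight C_m^k x^k (1 - x)^(m - k)
        bx = comb(m, k) * x**k * (1 - x)**(m - k)
        for l in range(n + 1):
            # weight C_n^l y^l (1 - y)^(n - l)
            by = comb(n, l) * y**l * (1 - y)**(n - l)
            total += f(k / m, l / n) * bx * by
    return total

def sample(x, y):
    """A test function, continuous on the unit square."""
    return x * x + y * y

def max_error(m, n, grid=11):
    """Largest error of P_{m,n} on a uniform grid of the unit square."""
    pts = [i / (grid - 1) for i in range(grid)]
    return max(abs(bernstein_2d(sample, m, n, x, y) - sample(x, y))
               for x in pts for y in pts)
```

For this particular test function the one-dimensional identity $B_n(x^2) = x^2 + x(1-x)/n$ gives the error in closed form, $x(1-x)/m + y(1-y)/n$, so the decrease of `max_error(m, n)` with growing `m` and `n` can be checked against it.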

We now give a treatment similar to that of [14] of the space $C$ of functions
continuous in $\Delta_0$ with the following definition of the norm:
$\| f \| = \max |f(x, y)|$ in $\Delta_0$. As in [15], it can be shown that the
general form of linear functional in $C$ is the Stieltjes integral

$$ \Phi(f) = \iint_{\Delta_0} f(x, y)\, G(d\Delta), \tag{130} $$

where $G(\Delta)$ is the function of bounded variation in $\Delta_0$
characterizing the functional $\Phi(f)$.

When defining a function of bounded variation, use can be made of sums
(128), whilst integral (130) is defined as the limit of sums (118) on indefinite
subdivision. Information on the space of continuous functions on bounded
closed sets can be found in Radon's article 'Linear functional transformations
and functional equations' (O lineinykh funktsional'nykh preobrazovaniyakh i
funktsional'nykh uravneniyakh) (Uspekhi matematicheskikh nauk, Vyp. 1,
1936) and in the book by F. Riesz and B. Szőkefalvi-Nagy, Lectures on
Functional Analysis.


26. The Fourier-Stieltjes integral. Let us consider a function expressible by
the Fourier-Stieltjes integral

$$ \varphi(t) = \int_{-\infty}^{+\infty} e^{itx}\, dg(x) \quad (-\infty < t < +\infty), \tag{131} $$

where $g(x)$ is a non-decreasing bounded function, continuous for
$x = \pm\infty$, i.e.

$$ g(-\infty) = \lim_{x \to -\infty} g(x); \quad g(+\infty) = \lim_{x \to +\infty} g(x). $$

Integral (131) obviously exists, since the function $e^{itx}$ is continuous
and bounded [4]. The elementary properties of the function $\varphi(t)$ are
as follows. We have

$$ |\varphi(t)| \le \int_{-\infty}^{+\infty} |e^{itx}|\, dg(x) = \int_{-\infty}^{+\infty} dg(x) = g(+\infty) - g(-\infty) = \varphi(0), $$

or $|\varphi(t)| \le \varphi(0)$, i.e. $\varphi(t)$ is bounded. The identity

$$ \varphi(-t) = \overline{\varphi(t)} \tag{132} $$

also follows at once from (131).
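Let us show further that $\varphi(t)$ is uniformly continuous in
$(-\infty, +\infty)$. We have for the absolute value of
$\varphi(t + h) - \varphi(t)$:

$$ |\varphi(t + h) - \varphi(t)| \le \int_{-\infty}^{+\infty} |e^{itx}|\, |e^{ihx} - 1|\, dg(x) = 2 \int_{-\infty}^{+\infty} \left| \sin \frac{hx}{2} \right| dg(x). \tag{133} $$

Let $\varepsilon$ be a given positive number. We first make $n$ so large that

$$ g(-n) - g(-\infty) < \varepsilon; \quad g(+\infty) - g(n) < \varepsilon. $$

We next fix an $\eta$ independent of $t$ such that, in the interval
$-n \le x \le n$:

$$ 2 \left| \sin \frac{hx}{2} \right| < \varepsilon \quad \text{for } |h| < \eta. $$

With this, since $2|\sin(hx/2)| \le 2$ outside this interval, we obtain by (133):

$$ |\varphi(t + h) - \varphi(t)| < [4 + g(n) - g(-n)]\,\varepsilon \le [4 + g(+\infty) - g(-\infty)]\,\varepsilon, $$

whence follows the uniform continuity of $\varphi(t)$, since $\varepsilon$ is
arbitrary and $\eta$ is independent of $t$. To prove a further property of
$\varphi(t)$, we take any $m$ real numbers $t_1, t_2, \ldots, t_m$ and form
the Hermitian form:

$$ \sum_{p,q=1}^{m} \varphi(t_p - t_q)\, \xi_p \bar{\xi}_q \tag{134} $$

in the variables $\xi_p$. On taking (131) into account, we can write

$$ \varphi(t_p - t_q)\, \xi_p \bar{\xi}_q = \int_{-\infty}^{+\infty} e^{it_p x} \xi_p\, \overline{e^{it_q x} \xi_q}\; dg(x), $$

and we get the following expression for the Hermitian form (134):

$$ \sum_{p,q=1}^{m} \varphi(t_p - t_q)\, \xi_p \bar{\xi}_q = \int_{-\infty}^{+\infty} \left| \sum_{s=1}^{m} e^{it_s x} \xi_s \right|^2 dg(x), $$

whence it follows at once that, given any $m$ and any choice $t_p$ of the
values of $t$, the Hermitian form (134) is non-negative, i.e.

$$ \sum_{p,q=1}^{m} \varphi(t_p - t_q)\, \xi_p \bar{\xi}_q \ge 0. \tag{135} $$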
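
The non-negativity (135) can be checked numerically in the simplest case. The sketch below assumes, purely for illustration, that $g(x)$ is a jump function with finitely many jumps $a_k > 0$ at points $x_k$, so that integral (131) reduces to a finite sum; the list `ATOMS` and the function names are hypothetical.

```python
import cmath
import random

# Assumed jump function g: pairs (x_k, a_k) with jumps a_k > 0 at x_k.
ATOMS = [(-1.0, 0.2), (0.5, 0.5), (2.0, 0.3)]

def phi(t):
    """phi(t) = sum_k a_k exp(i t x_k), i.e. (131) for a pure jump function."""
    return sum(a * cmath.exp(1j * t * x) for x, a in ATOMS)

def hermitian_form(ts, xis):
    """Left-hand side of (135): sum over p, q of phi(t_p - t_q) xi_p conj(xi_q)."""
    return sum(phi(tp - tq) * xp * xq.conjugate()
               for tp, xp in zip(ts, xis)
               for tq, xq in zip(ts, xis))

def integral_form(ts, xis):
    """The integral of |sum_s exp(i t_s x) xi_s|^2 dg(x), here a finite sum."""
    return sum(a * abs(sum(cmath.exp(1j * t * x) * xi
                           for t, xi in zip(ts, xis))) ** 2
               for x, a in ATOMS)

random.seed(0)
ts = [random.uniform(-5.0, 5.0) for _ in range(6)]
xis = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(6)]
```

Comparing `hermitian_form(ts, xis)` with `integral_form(ts, xis)` reproduces the identity preceding (135); the same `phi` also satisfies (132) and the bound $|\varphi(t)| \le \varphi(0)$.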
Let us introduce a new concept. We shall call $\varphi(t)$ positive definite
if it is continuous and bounded in $(-\infty, +\infty)$, satisfies identity
(132), and if, in addition, the Hermitian form (134) is non-negative for any
$m$ and any choice of points $t_p$. It follows from the above arguments that,
if the function $\varphi(t)$ is expressible by a Fourier-Stieltjes integral
(131) with a function $g(x)$ of the indicated type, it is positive definite.
It can be shown that, conversely, every positive definite function
$\varphi(t)$ is expressible by integral (131) with a function $g(x)$ of the
type indicated (Bochner, Vorlesungen über Fouriersche Integrale, p. 74, or
Math. Annal. Bd. 108). Let us return to integral (131) and write $g(x)$ as
the sum $g(x) = g_1(x) + g_2(x)$, where $g_1(x)$ is the jump function and
$g_2(x)$ is the continuous part of $g(x)$. The function $\varphi(t)$ can now
be written as the sum $\varphi(t) = \varphi_1(t) + \varphi_2(t)$, where

$$ \varphi_1(t) = \int_{-\infty}^{+\infty} e^{itx}\, dg_1(x); \quad \varphi_2(t) = \int_{-\infty}^{+\infty} e^{itx}\, dg_2(x). \tag{136} $$

Let $x_k$ ($k = 1, 2, \ldots$) be the abscissae of the points of discontinuity
of $g(x)$ and $a_k = g(x_k + 0) - g(x_k - 0)$, where obviously $a_k > 0$, and
the series consisting of the $a_k$ is convergent.

We have an expansion of $\varphi_1(t)$ as a uniformly convergent series:

$$ \varphi_1(t) = \sum_k a_k e^{i x_k t}. \tag{137} $$

If the number of points $x_k$ is finite, the sum written will be finite.
We bring in the so-called mean value of any continuous function $F(t)$,
defined in the interval $(-\infty, +\infty)$,

$$ M\{F(t)\} = \lim_{\omega \to \infty} \frac{1}{2\omega} \int_{-\omega}^{+\omega} F(t)\, dt, \tag{138} $$

if the limit on the right exists. It may easily be shown that, for the
$\varphi_2(t)$ defined by (136) with the continuous function $g_2(x)$:

$$ M\{\varphi_2(t)\} = 0. \tag{139} $$

For, on performing the integration under the integral sign, we get

$$ \frac{1}{2\omega} \int_{-\omega}^{+\omega} \varphi_2(t)\, dt = \int_{-\infty}^{+\infty} \frac{\sin \omega x}{\omega x}\, dg_2(x). $$

We split the interval of integration $[-\infty, +\infty]$ into three parts:
$[-\infty, -a]$, $[-a, +a]$, $[a, +\infty]$. Using elementary inequalities for
the integrals, we have

$$ \left| \frac{1}{2\omega} \int_{-\omega}^{+\omega} \varphi_2(t)\, dt \right| \le \frac{1}{\omega a} [g_2(-a) - g_2(-\infty)] + \frac{1}{\omega a} [g_2(+\infty) - g_2(a)] + [g_2(a) - g_2(-a)]. $$

Let $\varepsilon$ be a given positive number. In view of the continuity of
$g_2(x)$ at $x = 0$, we can choose an $a$ so small that
$g_2(a) - g_2(-a) < \varepsilon/2$. With $a$ fixed, the first two terms on
the right-hand side tend to zero as $\omega \to \infty$, so that, for all
sufficiently large $\omega$:

$$ \left| \frac{1}{2\omega} \int_{-\omega}^{+\omega} \varphi_2(t)\, dt \right| < \varepsilon, $$

whence (139) follows, since $\varepsilon$ is arbitrary. The possibility of
integrating with respect to $t$ under the Stieltjes integral sign is easily
justified. We now consider the function $\varphi_2(t) e^{-i\lambda t}$, which
we can write in the form

$$ \varphi_2(t)\, e^{-i\lambda t} = \int_{-\infty}^{+\infty} e^{it(x - \lambda)}\, dg_2(x), $$

or, on changing the variable of integration,

$$ \varphi_2(t)\, e^{-i\lambda t} = \int_{-\infty}^{+\infty} e^{itx}\, dg_2(x + \lambda), $$

and, since $g_2(x)$ is continuous, we have for any real $\lambda$:

$$ M\{\varphi_2(t)\, e^{-i\lambda t}\} = 0. $$

Further, if we make use of the uniform convergence of series (137) in the
interval $(-\infty, +\infty)$, as also the fact that

$$ M\{e^{i\lambda t}\} = 0 \quad \text{for } \lambda \ne 0, $$

we obtain

$$ M\{\varphi_1(t)\, e^{-i\lambda t}\} = a_k \quad (\lambda = x_k), $$

and

$$ M\{\varphi_1(t)\, e^{-i\lambda t}\} = 0, $$

if $\lambda$ does not coincide with one of the $x_k$. The above arguments lead
us to the following general result: if $\varphi(t)$ is expressible by
integral (131) with a non-decreasing bounded function $g(x)$, then
$M\{\varphi(t)\, e^{-i\lambda t}\} = g(\lambda + 0) - g(\lambda - 0)$,
and the right-hand side is zero if $\lambda$ is not a point of discontinuity
of $g(x)$. The mean value of the product $\varphi(t)\, e^{-i\lambda t}$
represents a generalization of the Fourier coefficient for a periodic
function. We shall return later to generalized Fourier coefficients in
connection with a generalization of the closure formula [29].
The next section is concerned with an inversion formula for integral (131),
i.e. a formula expressing $g(x)$ in terms of $\varphi(t)$.
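
The way the mean value picks out the jumps of $g(x)$ can be seen in a small sketch. It assumes a pure jump function with finitely many jumps (the list `ATOMS` and the names `sinc`, `mean_value` are hypothetical), for which the average over $[-\omega, +\omega]$ in (138) is available in closed form, because $\frac{1}{2\omega}\int_{-\omega}^{+\omega} e^{iut}\,dt = \sin(\omega u)/(\omega u)$.

```python
import math

# Assumed jump function g: pairs (x_k, a_k) with jumps a_k > 0 at x_k.
ATOMS = [(-1.0, 0.2), (0.5, 0.5), (2.0, 0.3)]

def sinc(u):
    """sin(u)/u with the limiting value 1 at u = 0."""
    return 1.0 if u == 0.0 else math.sin(u) / u

def mean_value(lam, omega):
    """(1/2w) * integral over [-w, w] of phi_1(t) exp(-i lam t) dt,
    evaluated term by term from series (137)."""
    return sum(a * sinc(omega * (x - lam)) for x, a in ATOMS)
```

As `omega` grows, `mean_value(0.5, omega)` approaches the jump 0.5 at the point 0.5, while `mean_value(1.0, omega)` tends to zero because 1.0 is not a point of discontinuity.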


27. Inversion formula. In Volume II, we proved the second mean value
theorem, the possibility of expanding a function in a Fourier series, and the
Fourier integral formula, on the assumption that the function in question
satisfies the Dirichlet conditions. On glancing back over the proofs, it may
easily be seen that they retain their force if the Dirichlet conditions are
replaced by the requirement that the function be of bounded variation. We have
the following result for the Fourier integral: if $f(x)$ is a function of
bounded variation in any finite interval and is absolutely integrable in the
Riemann sense over an infinite interval, the formula

$$ \varphi(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} f(x)\, e^{itx}\, dx \tag{140} $$

implies the following inversion formula:

$$ f(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} \varphi(t)\, e^{-itx}\, dt, \tag{141} $$

where the last integral has to be understood in the sense of the principal
value and $f(x)$ on the left-hand side must be replaced at a point of
discontinuity by $\tfrac{1}{2}[f(x - 0) + f(x + 0)]$. This result is sometimes
written in a rather different form, viz. the formula

$$ \varphi(t) = \int_{-\infty}^{+\infty} f(x)\, e^{itx}\, dx \tag{142} $$

implies that [III$_2$; 130]:

$$ f(x) = \frac{1}{2\pi}\, \text{V.P.} \int_{-\infty}^{+\infty} \varphi(t)\, e^{-itx}\, dt = \frac{1}{2\pi} \lim_{N \to \infty} \int_{-N}^{+N} \varphi(t)\, e^{-itx}\, dt. \tag{143} $$

Our problem is to construct an inversion formula for integral (131). The
form that this formula must have may easily be guessed. We use the following
heuristic method, which naturally does not have the force of a proof. We
replace $dg(x)$ in integral (131) by $g'(x)\, dx$, on the assumption that
$g'(x)$ is the derivative of $g(x)$. We thus get

$$ \varphi(t) = \int_{-\infty}^{+\infty} g'(x)\, e^{itx}\, dx, $$

and the inversion formula for the ordinary Fourier integral gives us

$$ g'(x) = \frac{1}{2\pi} \lim_{N \to \infty} \int_{-N}^{+N} \varphi(t)\, e^{-itx}\, dt. $$

On integrating both sides of the last formula with respect to $x$, say from
$x = 0$ to some value $x$, and carrying out the integration on the right-hand
side under the integral sign, we finally obtain the inversion formula:

$$ g(x) - g(0) = \frac{1}{2\pi} \lim_{N \to \infty} \int_{-N}^{+N} \varphi(t)\, \frac{1 - e^{-itx}}{it}\, dt. \tag{144} $$

The value of $g(x)$ at a fixed point $x = 0$ has appeared on the left-hand
side, since $g(x)$ is evidently only defined up to a constant term. We must
now provide a strict proof of (144).

Having fixed the value of $x$ in some way, we consider the following function
of the variable $y$:

$$ h(y) = g(y + x) - g(y). \tag{145} $$

Since it is the difference between two increasing functions, it will be of
bounded variation in any finite interval. Let us show further that it is
absolutely integrable over an infinite interval. We shall assume for
definiteness that $x > 0$. The proof is essentially the same if $x < 0$. The
function $g(y)$ is increasing, so that we can write

$$ \int_p^q |h(y)|\, dy = \int_p^q [g(y + x) - g(y)]\, dy = \int_p^q g(y + x)\, dy - \int_p^q g(y)\, dy \quad (q > p), \tag{146} $$

or, on carrying out the change of variable $y + x = z$ in the first integral:

$$ \int_p^q |h(y)|\, dy = \int_{p+x}^{q+x} g(z)\, dz - \int_p^q g(z)\, dz = \int_q^{q+x} g(z)\, dz - \int_p^{p+x} g(z)\, dz. $$

On taking into account the fact that $g(p) \le g(z)$ in the interval
$[p, p + x]$ whilst $g(q + x) \ge g(z)$ in the interval $[q, q + x]$, we get

$$ \int_p^q |h(y)|\, dy \le [g(q + x) - g(p)]\, x. $$

It is clear from this that integral (146) is as small as desired for
sufficiently large $p$ and any $q > p$, which in fact proves that $h(y)$ is
absolutely integrable over an infinite interval. Fourier's formula is
therefore applicable to function (145):

$$ g(y + x) - g(y) = \frac{1}{2\pi} \lim_{N \to \infty} \int_{-N}^{+N} e^{-ity} \left[ \int_{-\infty}^{+\infty} [g(z + x) - g(z)]\, e^{itz}\, dz \right] dt, $$

or, with $y = 0$:

$$ g(x) - g(0) = \frac{1}{2\pi} \lim_{N \to \infty} \int_{-N}^{+N} \left[ \int_{-\infty}^{+\infty} [g(z + x) - g(z)]\, e^{itz}\, dz \right] dt. \tag{147} $$

Let us consider the inner integral $I$ and apply to it the formula for
integration by parts:

$$ I = \left[ \frac{1}{it}\, e^{itz}\, [g(z + x) - g(z)] \right]_{z=-\infty}^{z=+\infty} - \frac{1}{it} \int_{-\infty}^{+\infty} e^{itz}\, d[g(z + x) - g(z)]. $$

The integrated term vanishes, since $g(z + x) - g(z) \to 0$ as
$z \to \pm\infty$, whence, on using (131), we obtain

$$ I = \frac{1}{it}\, \varphi(t) - \frac{1}{it} \int_{-\infty}^{+\infty} e^{itz}\, dg(z + x). $$

On carrying out the change of variables of integration $z + x = u$ in the
last integral, we finally arrive at the following equation:

$$ \int_{-\infty}^{+\infty} [g(z + x) - g(z)]\, e^{itz}\, dz = \frac{1 - e^{-itx}}{it}\, \varphi(t). \tag{148} $$

On substituting this in (147), we obtain the inversion formula (144).

We recall that the left-hand side $f(x)$ in (143), giving the inversion of the
ordinary Fourier transformation, has to be reckoned equal to the arithmetic
mean of its limits from the left and right at points of discontinuity; we know
this from [II; 143]. Consequently, the same remark holds for the left-hand
side of (147) and the left-hand side of inversion formula (144). Inversion
formula (144) for integral (131) also holds when $g(x)$ is a function of
bounded variation. To see this, we only need to write $g(x)$ as the difference
between two increasing functions, split integral (131) into two integrals and
apply to each the inversion formula proved above.
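
Inversion formula (144) can be tested numerically on one familiar pair. The sketch below assumes $g(x)$ to be the standard normal distribution function, for which integral (131) gives $\varphi(t) = e^{-t^2/2}$; the truncation `N`, the midpoint rule and the function names are choices made for this example only.

```python
import cmath
import math

def phi(t):
    """Fourier-Stieltjes transform (131) of the standard normal g(x)."""
    return math.exp(-t * t / 2.0)

def inversion(x, N=10.0, steps=20000):
    """Right-hand side of (144), truncated to [-N, N], by the midpoint rule."""
    h = 2.0 * N / steps
    total = 0j
    for j in range(steps):
        t = -N + (j + 0.5) * h
        if t == 0.0:
            kernel = x  # limiting value of (1 - exp(-itx)) / (it) at t = 0
        else:
            kernel = (1 - cmath.exp(-1j * t * x)) / (1j * t)
        total += phi(t) * kernel * h
    return total / (2.0 * math.pi)

def g_difference(x):
    """g(x) - g(0) for the standard normal distribution function."""
    return 0.5 * math.erf(x / math.sqrt(2.0))
```

The values of `inversion(x)` and `g_difference(x)` should agree to within the quadrature error; the imaginary part of `inversion(x)` is only rounding noise, since the odd part of the kernel integrates to zero.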


28. Convolution theorem. As we know, the inversion formula for the ordinary
Fourier integral is directly connected with the inversion formula for the
Laplace integral [IV; 44, 45], and we had a convolution theorem for the latter
integral. A similar theorem holds for the Fourier-Stieltjes integral. Let
$g_1(x)$ and $g_2(x)$ be two functions with the properties indicated above for
$g(x)$. The following general Stieltjes integral will exist:

$$ g_3(t) = \int_{-\infty}^{+\infty} g_2(t - x)\, dg_1(x). \tag{149} $$

It is easily shown that $g_3(x)$ has the above-mentioned properties and that,
if $g_1(x)$ and $g_2(x)$ are continuous, $g_3(x)$ is also continuous.
We form the Fourier-Stieltjes transformation for the functions $g_k(x)$:

$$ \varphi_k(t) = \int_{-\infty}^{+\infty} e^{itx}\, dg_k(x) \quad (k = 1, 2, 3). \tag{150} $$

The convolution theorem amounts to the assertion that these transformed
functions satisfy the following simple equation:

$$ \varphi_3(t) = \varphi_1(t)\, \varphi_2(t). \tag{151} $$

We apply (148) to function $\varphi_3(t)$:

$$ \frac{1 - e^{-iut}}{it}\, \varphi_3(t) = \int_{-\infty}^{+\infty} [g_3(z + u) - g_3(z)]\, e^{itz}\, dz. $$

On replacing $g_3(t)$ by its expression (149), we get

$$ \frac{1 - e^{-iut}}{it}\, \varphi_3(t) = \int_{-\infty}^{+\infty} \left\{ \int_{-\infty}^{+\infty} [g_2(z + u - x) - g_2(z - x)]\, dg_1(x) \right\} e^{itz}\, dz. $$

We change the order of integration. We shall not dwell on the proof that this
is possible, since it will follow from a general theorem on changing the order
of integration which will be proved later. We get

$$ \frac{1 - e^{-iut}}{it}\, \varphi_3(t) = \int_{-\infty}^{+\infty} \left\{ \int_{-\infty}^{+\infty} [g_2(z + u - x) - g_2(z - x)]\, e^{itz}\, dz \right\} dg_1(x). $$

We replace the $z$ in the inner integral by the new variable of integration
$y$ given by $z - x = y$. The last equation can now be written as

$$ \frac{1 - e^{-iut}}{it}\, \varphi_3(t) = \int_{-\infty}^{+\infty} \left\{ \int_{-\infty}^{+\infty} [g_2(y + u) - g_2(y)]\, e^{ity}\, dy \right\} e^{itx}\, dg_1(x). $$

The inner integral can be expressed in terms of $\varphi_2(t)$ in accordance
with (148), and we arrive at the formula

$$ \frac{1 - e^{-iut}}{it}\, \varphi_3(t) = \frac{1 - e^{-iut}}{it}\, \varphi_2(t) \int_{-\infty}^{+\infty} e^{itx}\, dg_1(x), $$

which, by (150), is the same as (151).
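
For jump functions the convolution theorem can be verified directly. In the sketch below, offered as an illustration under the assumption that $g_1$ and $g_2$ have finitely many jumps, each function is represented by a dictionary mapping a jump point to the size of the jump; then (149) gives a $g_3$ with jumps $a_i b_j$ at the points $x_i + y_j$. The names `transform` and `convolve` are hypothetical.

```python
import cmath

def transform(atoms, t):
    """Fourier-Stieltjes transform (150) of a jump function given as
    {jump point: jump size}."""
    return sum(a * cmath.exp(1j * t * x) for x, a in atoms.items())

def convolve(atoms1, atoms2):
    """Jumps of g_3 from (149): size a*b at each point x + y."""
    out = {}
    for x, a in atoms1.items():
        for y, b in atoms2.items():
            out[x + y] = out.get(x + y, 0.0) + a * b
    return out

g1 = {0.0: 0.4, 1.0: 0.6}
g2 = {-0.5: 0.5, 2.0: 0.5}
g3 = convolve(g1, g2)
```

Evaluating `transform(g3, t)` and `transform(g1, t) * transform(g2, t)` at any real `t` gives the same complex number up to rounding, which is relation (151) in this discrete case.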

It follows at once from the convolution theorem that, if $\varphi_1(t)$ and
$\varphi_2(t)$ are expressible by an integral of type (131), i.e. belong to
the class $P$ of positive definite functions, the same can be said of their
product. It is immediately obvious that the linear combination
$c_1 \varphi_1(t) + c_2 \varphi_2(t)$ with positive coefficients also belongs
to class $P$. Further, if $\varphi(t)$ belongs to $P$, $\overline{\varphi(t)}$
belongs to $P$, viz. if $g_1(x)$ defines $\varphi(t)$, we have for
$\overline{\varphi(t)}$:

$$ \overline{\varphi(t)} = \int_{-\infty}^{+\infty} e^{-itx}\, dg_1(x) = \int_{-\infty}^{+\infty} e^{itx}\, dg_2(x), $$

where $g_2(x) = -g_1(-x)$. Thus $|\varphi(t)|^2$ will belong to $P$ and will
be defined by the function

$$ h(x) = \int_{-\infty}^{+\infty} g_2(x - t)\, dg_1(t). $$

If $g_1(x)$ is continuous, $h(x)$ will be continuous, and we have, by what was
proved in [26]:

$$ M\{|\varphi(t)|^2\} = 0. $$

Let us return to the function $\varphi(t)$ defined by (131), and, as above,
let $a_k$ be generalized Fourier coefficients corresponding to
$\lambda = x_k$. On using (137) and the fact that the positive $a_k$ form a
convergent series, we get

$$ \sum_k a_k^2 = M\{|\varphi_1(t)|^2\}. \tag{152} $$

On the other hand, by what has been proved,

$$ M\{|\varphi_2(t)|^2\} = 0. \tag{153} $$

We have further:

$$ |\varphi(t)|^2 = |\varphi_1(t) + \varphi_2(t)|^2 = |\varphi_1(t)|^2 + |\varphi_2(t)|^2 + \varphi_1(t)\, \overline{\varphi_2(t)} + \varphi_2(t)\, \overline{\varphi_1(t)}. $$

In view of the Buniakowski-Schwarz inequality

$$ \left| \frac{1}{2\omega} \int_{-\omega}^{+\omega} \varphi_1(t)\, \overline{\varphi_2(t)}\, dt \right|^2 \le \frac{1}{2\omega} \int_{-\omega}^{+\omega} |\varphi_1(t)|^2\, dt \cdot \frac{1}{2\omega} \int_{-\omega}^{+\omega} |\varphi_2(t)|^2\, dt, $$

together with (152) and (153), we can say that the right-hand side tends to
zero on indefinite increase of $\omega$. Hence
$M\{\varphi_1(t)\, \overline{\varphi_2(t)}\} = 0$, and similarly,
$M\{\varphi_2(t)\, \overline{\varphi_1(t)}\} = 0$. Thus the following closure
theorem follows from (152) and (153) for any function of class $P$:

$$ \sum_k a_k^2 = M\{|\varphi(t)|^2\}. \tag{154} $$


29. The Cauchy-Stieltjes integral. We take the Cauchy integral over
the entire real axis:

$$ \omega(z) = \int_{-\infty}^{+\infty} \frac{\psi(x)}{x - z}\, dx. \tag{155} $$

Given certain assumptions regarding $\psi(x)$, this integral exists, and
the function $\omega(z)$ of the complex variable $z$ is regular both in the
upper and lower half-plane. These regular functions are different
analytic functions in the half-planes, and we know that $\psi(x)$ can be
expressed in terms of the jumps of $\omega(z)$ on the real axis, i.e. the
formula holds [IV; 85]:

$$ \psi(x) = \lim_{\tau \to +0} \frac{1}{2\pi i}\, [\omega(x + \tau i) - \omega(x - \tau i)]. $$

Let $g(x)$ be a function of bounded variation in the infinite interval
$[-\infty, +\infty]$, and let us form the Cauchy-Stieltjes integral for a
complex value of $z$:

$$ \omega(z) = \int_{-\infty}^{+\infty} \frac{dg(x)}{x - z}. \tag{156} $$

The integrated function $1/(x - z)$ is continuous throughout the real
axis and tends to zero as $x \to \pm\infty$. Thus integral (156) can be
understood as a Stieltjes integral [4]. We shall prove the following inversion
formula for integral (156):

$$ g(x) - g(0) = \lim_{\tau \to +0} \frac{1}{2\pi i} \int_0^x [\omega(\sigma + \tau i) - \omega(\sigma - \tau i)]\, d\sigma, \tag{157} $$

where half the sum of limiting values from the left and right has to be
taken for $g(x)$ at its points of discontinuity. We could have made a
guess at this inversion formula, just as we did above for inversion of
Fourier-Stieltjes integrals.

Before proving (157), we consider the Poisson integral for the case
of a half-plane. We put $z = \sigma + \tau i$ and separate the real and
imaginary parts in the Cauchy kernel:

$$ \frac{1}{x - z} = \frac{x - \sigma}{(x - \sigma)^2 + \tau^2} + i\, \frac{\tau}{(x - \sigma)^2 + \tau^2}. $$

On separating the imaginary part in integral (155) and adding the
factor $1/\pi$, we in fact arrive at the Poisson integral for the half-plane:

$$ F(\sigma, \tau) = \frac{1}{\pi} \int_{-\infty}^{+\infty} \frac{\tau}{(x - \sigma)^2 + \tau^2}\, \psi(x)\, dx. \tag{158} $$

It is evidently a harmonic function both in the upper and lower
half-planes. We shall prove the following as regards this integral:
if $\psi(x)$ is a bounded function, Riemann integrable in any finite interval
(say a function of bounded variation $g(x)$), integral (158) (which evidently
exists) tends to $\psi(\sigma)$ at points where $\psi(x)$ is continuous and to
half the sum of the limiting values from the left and right at points of
discontinuity of the first kind when $\tau$ tends to $+0$. This convergence is
uniform with respect to $\sigma$ in any closed interval of variation of
$\sigma$ lying inside the interval where $\psi(x)$ is continuous. To prove
this, we first notice the obvious equation, valid for $\tau > 0$:

$$ \frac{1}{\pi} \int_{-\infty}^{+\infty} \frac{\tau}{x^2 + \tau^2}\, dx = \frac{2}{\pi} \int_0^{\infty} \frac{\tau}{x^2 + \tau^2}\, dx = 1. \tag{159} $$

We replace $x$ in (158) by the new variable of integration $y = x - \sigma$:

$$ F(\sigma, \tau) = \frac{1}{\pi} \int_{-\infty}^{+\infty} \frac{\tau}{y^2 + \tau^2}\, \psi(y + \sigma)\, dy. $$

We split the interval of integration into two: $[-\infty, 0]$, $[0, +\infty]$,
and introduce the new variable of integration $y_1 = -y$ into the first
of the integrals obtained. We thus arrive at the formula:

$$ F(\sigma, \tau) = \frac{2}{\pi} \int_0^{\infty} \frac{\tau}{x^2 + \tau^2} \cdot \frac{\psi(\sigma + x) + \psi(\sigma - x)}{2}\, dx. \tag{160} $$

Suppose that $\sigma$ is a point where $\psi(x)$ is continuous, or a point of
discontinuity of the first kind. We multiply both sides of (159) by
$[\psi(\sigma + 0) + \psi(\sigma - 0)]/2$, and subtract term by term from
(160). We thus obtain the equation

$$ F(\sigma, \tau) - \frac{\psi(\sigma + 0) + \psi(\sigma - 0)}{2} = \frac{2}{\pi} \int_0^{\infty} \frac{\tau}{x^2 + \tau^2}\, w(x)\, dx, \tag{161} $$

where

$$ w(x) = \frac{\psi(\sigma + x) + \psi(\sigma - x)}{2} - \frac{\psi(\sigma + 0) + \psi(\sigma - 0)}{2}. \tag{162} $$

Let $\varepsilon$ be a given positive number. There exists a positive $\eta$
such that $|w(x)| \le \varepsilon$ for $0 < x \le \eta$. If $\sigma$ lies in a
closed interval contained in an interval where $\psi(x)$ is continuous, in
view of the uniform continuity of $\psi(x)$, the number $\eta$ is determined
only by $\varepsilon$ and is independent of $\sigma$. We split the interval of
integration in (161) into $[0, \eta]$ and $[\eta, \infty]$. We have for the
first interval of integration:

$$ \left| \frac{2}{\pi} \int_0^{\eta} \frac{\tau}{x^2 + \tau^2}\, w(x)\, dx \right| \le \frac{2}{\pi}\, \varepsilon \int_0^{\eta} \frac{\tau}{x^2 + \tau^2}\, dx < \frac{2}{\pi}\, \varepsilon \int_0^{\infty} \frac{\tau}{x^2 + \tau^2}\, dx = \varepsilon. $$

As regards an inequality for the integral over the second interval,
we observe that $w(x)$ is bounded, as follows from the fact that $\psi(x)$ is
bounded, i.e. $|w(x)| \le L$, where $L$ is a positive number. We thus have

$$ \left| \frac{2}{\pi} \int_{\eta}^{\infty} \frac{\tau}{x^2 + \tau^2}\, w(x)\, dx \right| \le \frac{2L}{\pi} \int_{\eta}^{\infty} \frac{\tau\, dx}{x^2 + \tau^2} = \frac{2L}{\pi} \left( \frac{\pi}{2} - \arctan \frac{\eta}{\tau} \right), $$

and we get the following inequality for the difference on the left-hand
side of (161):

$$ \left| F(\sigma, \tau) - \frac{\psi(\sigma + 0) + \psi(\sigma - 0)}{2} \right| < \varepsilon + \frac{2L}{\pi} \left( \frac{\pi}{2} - \arctan \frac{\eta}{\tau} \right). $$

The difference contained in the second term obviously tends to zero
when the positive number $\tau$ tends to zero, and this second term will be
less than $\varepsilon$ for all $\tau$ sufficiently close to zero. We thus
have, for all $\tau$ sufficiently close to zero:

$$ \left| F(\sigma, \tau) - \frac{\psi(\sigma + 0) + \psi(\sigma - 0)}{2} \right| < 2\varepsilon, $$

from which our above assertion follows, since $\varepsilon$ is arbitrary. The
closeness of $\tau$ to zero is guaranteed by the value of $\eta$, which does
not depend on $\sigma$ for the intervals in which $\psi(x)$ is continuous.
Hence follows the uniformity of the convergence in these intervals of
continuity.
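
The limiting behaviour of the Poisson integral can be observed on a step function, for which (158) is elementary. The sketch assumes $\psi(x) = 1$ on $[a, b]$ and $\psi(x) = 0$ elsewhere (an example chosen here, not taken from the text), so that for $\tau > 0$ the integral equals $\frac{1}{\pi}\left[\arctan\frac{b - \sigma}{\tau} - \arctan\frac{a - \sigma}{\tau}\right]$; the function name is hypothetical.

```python
import math

def poisson_step(sigma, tau, a=0.0, b=1.0):
    """Poisson integral (158) for psi = 1 on [a, b], 0 elsewhere (tau > 0)."""
    return (math.atan((b - sigma) / tau)
            - math.atan((a - sigma) / tau)) / math.pi
```

For small positive `tau` the value is close to 1 at interior points of $[a, b]$, close to 0 outside, and close to 1/2 at the two points of discontinuity, i.e. half the sum of the limiting values from the left and right.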
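We now turn to the proof of the inversion formula (157). We form the
function

$$ F_1(\sigma, \tau) = \frac{1}{2\pi i}\, [\omega(\sigma + \tau i) - \omega(\sigma - \tau i)] = \frac{1}{\pi} \int_{-\infty}^{+\infty} \frac{\tau}{(x - \sigma)^2 + \tau^2}\, dg(x). $$

On integrating by parts, we can write

$$ F_1(\sigma, \tau) = -\frac{1}{\pi} \int_{-\infty}^{+\infty} g(x)\, \frac{\partial}{\partial x} \left( \frac{\tau}{(x - \sigma)^2 + \tau^2} \right) dx = \frac{1}{\pi} \int_{-\infty}^{+\infty} g(x)\, \frac{\partial}{\partial \sigma} \left( \frac{\tau}{(x - \sigma)^2 + \tau^2} \right) dx. $$

Since $g(x)$ is bounded, the improper integral written is uniformly
convergent with respect to $\sigma$ belonging to any finite interval; by
integrating both sides of the last formula with respect to $\sigma$ over the
interval $[0, x_0]$, where the integration on the right-hand side is carried
out under the integral sign, we get

$$ \frac{1}{2\pi i} \int_0^{x_0} [\omega(\sigma + \tau i) - \omega(\sigma - \tau i)]\, d\sigma = \frac{1}{\pi} \int_{-\infty}^{+\infty} \frac{\tau}{(x - x_0)^2 + \tau^2}\, g(x)\, dx - \frac{1}{\pi} \int_{-\infty}^{+\infty} \frac{\tau}{x^2 + \tau^2}\, g(x)\, dx. $$

The integrals on the right-hand side are Poisson integrals, and we can
apply the theorem proved above to arrive at inversion formula (157).
This formula was first given by Stieltjes and is usually known as the
Stieltjes formula. It must be remarked that the values of the function
$g(x)$ at the ends of the interval $[-\infty, +\infty]$ are not important for
integral (156), since the integrated function tends to zero as $x$ tends to
$\pm\infty$.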


CHAPTER II 


SET FUNCTIONS AND 
THE LEBESGUE INTEGRAL 


§ 1. Set functions and the theory of measure 


30. Operations on sets. We shall construct a more general type of
integral by dividing the basic interval of integration into point sets
of a general kind, instead of into the usual intervals. Moreover, a point
set of a more general type will often be used instead of an ordinary
interval as the basic domain of integration. The first section of the
present chapter is devoted to a study of such general sets and to
functions defined on such sets. We start with a study of the fundamental
concepts and facts in regard to sets consisting of any elements
(i.e. not necessarily point sets). Some fundamental concepts and notation,
of which wide use will be made later, must first be introduced
for general sets. We shall in fact primarily make use of point sets, i.e.
sets whose elements are points of a straight line or a plane, or in general
of a multi-dimensional space.

If an element $x$ belongs to a set $A$, this is written as $x \in A$. If $x$
does not belong to $A$, this is written as $x \notin A$. If all the elements
appearing in a set $A$ appear also in a set $B$, we say that $A$ is part of
$B$, and write $A \subset B$ or $B \supset A$. If sets $A$ and $B$ contain the
same elements, we write $A = B$. If all the elements appearing in $A$ appear
also in $B$, but there are elements of $B$ which do not belong to $A$, we
sometimes underline this last fact by saying that $A$ is a regular part of
$B$. If $A \subset B$ and $B \subset C$, it follows that $A \subset C$. Let

$$ A_1, A_2, A_3, \ldots \tag{1} $$

be sets, the number of which is finite or denumerable. The sum of sets
(1) is defined as the set $\mathcal{E}_1$ whose elements are elements
belonging to at least one of the sets $A_n$. We denote a sum of sets by using
the ordinary symbols

$$ \mathcal{E}_1 = A_1 + A_2 + \ldots \quad \text{or} \quad \mathcal{E}_1 = \sum_n A_n. $$

The product of sets (1) is defined as the set $\mathcal{E}_2$ whose elements
are the elements appearing in all the sets $A_n$. A product of sets is written
in the usual way as

$$ \mathcal{E}_2 = A_1 A_2 \ldots \quad \text{or} \quad \mathcal{E}_2 = \prod_n A_n. $$

A product of sets may contain no elements. The set that contains
no elements is called the empty set and is written as $\Lambda$. For instance,
if $A$ and $B$ are sets having no common elements†, we have $AB = \Lambda$.
Sums and products of sets are obviously subject to the commutative
and associative laws. We have, for instance:

$$ A + B = B + A; \quad A + (B + D) = (A + B) + D. \tag{2} $$

The distributive law also holds, i.e.

$$ B \sum_n A_n = \sum_n B A_n. \tag{3} $$

To prove this, we have to show that every element $x$ appearing in a
set on the left-hand side also appears in the set on the right-hand side,
and vice versa. If $x$ is an element of the set on the left-hand side of (3),
it belongs simultaneously to $B$ and to the sum of sets $A_n$, i.e. to at
least one of sets $A_n$. Let $x \in A_{n_0}$. Hence $x \in B$ and
$x \in A_{n_0}$, i.e. $x \in BA_{n_0}$, so that $x$ belongs to the set on the
right-hand side of (3). Conversely, if $x$ belongs to this latter set, it
belongs to at least one of the products $BA_n$. Suppose that
$x \in BA_{n_0}$, i.e. $x \in B$ and $x \in A_{n_0}$. Hence it follows
that $x \in B$ and that $x$ belongs to the sum of sets $A_n$, i.e. $x$ belongs
to the set on the left-hand side of (3), and this formula is proved. If
$B \subset A$, then obviously $A + B = A$. Hence it follows at once from (2)
and (3) that

$$ (A + B)(A + D) = A + BD. \tag{4} $$

We now define a difference. We understand by the difference of
sets $A - B$ the set whose elements are elements of $A$ not appearing
in $B$. If $A \subset B$, then $A - B$ is the empty set. It must be noticed
that we do not assume that $B \subset A$ when defining the difference $A - B$.
If $B \subset A$, we have the obvious formula

$$ A = B + (A - B). $$

We have in the general case

$$ A + B = B + (A - B). \tag{5} $$

† If the sets $A$ and $B$ have no common elements, they are said to be
disjoint [Transl.].

We must mention some formulae which will be useful later. The proofs
present no difficulty. If $A - B = \mathcal{E}_1$ and
$B - A = \mathcal{E}_2$, then $A + \mathcal{E}_2 = B + \mathcal{E}_1$.
If $A_n \subset B_n$, then

$$ \sum_n A_n \subset \sum_n B_n \quad \text{and} \quad \sum_n B_n - \sum_n A_n \subset \sum_n (B_n - A_n). \tag{6} $$

The following formulae are connected with the concept of difference:

$$ A - (B - D) \subset (A - B) + D; \tag{7} $$
$$ AB = A - (A - B); \tag{7$_1$} $$
$$ (A_1 - A_2) - (B_1 - B_2) \subset (A_1 - B_1) + (B_2 - A_2); \tag{8} $$
$$ A + B = (A - B) + (B - A) + AB. \tag{8$_1$} $$
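
The formulae above are easy to spot-check on small finite sets. The following sketch uses Python's built-in `set` type, reading the sum of sets as `|`, the product as `&`, the difference as `-` and the inclusion as `<=`; the helper names are hypothetical, and a check on particular sets is of course an illustration, not a proof.

```python
def check_basic(A, B, D):
    """Spot-check formulae (4), (5), (7), (7_1) and (8_1) on given sets."""
    return all([
        (A | B) & (A | D) == A | (B & D),       # (4)
        A | B == B | (A - B),                   # (5)
        A - (B - D) <= (A - B) | D,             # (7)
        A & B == A - (A - B),                   # (7_1)
        A | B == (A - B) | (B - A) | (A & B),   # (8_1)
    ])

def check_difference(A1, A2, B1, B2):
    """Spot-check formula (8) on given sets."""
    return (A1 - A2) - (B1 - B2) <= (A1 - B1) | (B2 - A2)

def check_distributive(B, family):
    """Spot-check the distributive law (3) for a finite family of sets."""
    total = set().union(*family)
    return B & total == set().union(*(B & A_n for A_n in family))
```

For instance, `check_basic({1, 2, 3}, {2, 3, 4}, {3, 5})` evaluates all five relations on one concrete triple of sets.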


We shall next establish the concept of a monotonic sequence of sets,
and the concept of limit. If $\mathcal{E}_1, \mathcal{E}_2, \ldots$ is a given
infinite sequence of sets, and

$$ \mathcal{E}_1 \subset \mathcal{E}_2 \subset \ldots \subset \mathcal{E}_n \subset \ldots, \tag{9} $$

we shall describe the sequence as increasing. A decreasing sequence of
sets is defined by the condition that

$$ \mathcal{E}_1 \supset \mathcal{E}_2 \supset \ldots \supset \mathcal{E}_n \supset \ldots \tag{10} $$

In case (9), the limit of sets $\mathcal{E}_n$ is defined as the set
$\mathcal{E}$ whose elements are elements belonging to at least one of the
$\mathcal{E}_n$. We write this as
$\mathcal{E} = \lim_{n \to \infty} \mathcal{E}_n$. Notice that, in case (9),
if an element $x$ belongs to the set $\mathcal{E}_k$, it belongs to all the
sets $\mathcal{E}_n$ for $n > k$. We have the obvious formula in case (9):

$$ \mathcal{E} = \lim_{n \to \infty} \mathcal{E}_n = \sum_{n=1}^{\infty} \mathcal{E}_n, \tag{11} $$

which can also be written as

$$ \mathcal{E} = \lim_{n \to \infty} \mathcal{E}_n = \mathcal{E}_1 + \sum_{k=1}^{\infty} (\mathcal{E}_{k+1} - \mathcal{E}_k). \tag{12} $$

Pairs of the sets appearing in the sum on the right-hand side of (12)
have no common points [i.e. these sets are pairwise disjoint]. In case
(10) we say that $\mathcal{E}$ is the limit of sets $\mathcal{E}_n$ if its
elements are elements belonging to all the $\mathcal{E}_n$. We have in this
case:

$$ \mathcal{E} = \lim_{n \to \infty} \mathcal{E}_n = \prod_{n=1}^{\infty} \mathcal{E}_n, \tag{13} $$

and furthermore, we can write in case (10):

$$ \mathcal{E}_1 = \mathcal{E} + \sum_{k=1}^{\infty} (\mathcal{E}_k - \mathcal{E}_{k+1}). \tag{14} $$

The terms on the right-hand side of (14) are pairwise disjoint.
We have defined the limit of a sequence of sets only in the case 
of a monotonic sequence. A general definition could have been given, 
but we shall not dwell on this, since it is of no use to us in what follows. 
A further concept must be introduced specially for point sets. Let
$\mathcal{E}$ be a set of points on a plane. The set of all points of the plane not
belonging to $\mathcal{E}$ will be termed the complement of $\mathcal{E}$. This complementary
set is usually written as $C\mathcal{E}$. The following formulae for complementary
sets must be noted. If the concept of complement is applied
twice, we obtain the original set, i.e. $C(C\mathcal{E}) = \mathcal{E}$. If $\mathcal{E}_1 \subset \mathcal{E}_2$, then
$C\mathcal{E}_1 \supset C\mathcal{E}_2$. The further formulae:

$$\prod_n \mathcal{E}_n = C \sum_n C\mathcal{E}_n, \qquad (15)$$
$$\sum_n \mathcal{E}_n = C \prod_n C\mathcal{E}_n, \qquad (16)$$
$$C \prod_n \mathcal{E}_n = \sum_n C\mathcal{E}_n, \qquad (17)$$
$$C \sum_n \mathcal{E}_n = \prod_n C\mathcal{E}_n, \qquad (18)$$
$$C\mathcal{E}_2 - C\mathcal{E}_1 = \mathcal{E}_1 - \mathcal{E}_2, \qquad (19)$$
$$\mathcal{E}_1 - \mathcal{E}_2 = \mathcal{E}_1 \cdot C\mathcal{E}_2 \qquad (20)$$

are proved without difficulty. The concept of complement can
obviously be brought in for a straight line, when the initial set is
arranged on the line, or similarly, for any multi-dimensional space.
The concept of complement with respect to a set $A$ is sometimes
introduced. If every point of the set $\mathcal{E}$ belongs to $A$, the complement
of $\mathcal{E}$ with respect to $A$ is defined as the difference $A - \mathcal{E}$. We shall
only use the concept of complement with respect to the entire space,
i.e. with respect to a straight line, plane, etc.
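The formulae for complementary sets can be verified on a small finite "space". This is a minimal sketch, assuming a finite set `space` plays the role of the whole plane and that every set considered is a subset of it; `complement` and `check_complement_formulae` are illustrative names only, and the comments indicate which kind of formula each line checks.

```python
def complement(E, space):
    """Complement of E with respect to the whole (finite) space."""
    return space - E

def check_complement_formulae(sets, space):
    """Check the duality formulae for a list of at least two subsets of `space`."""
    C = lambda E: complement(E, space)
    union = set().union(*sets)                                   # sum of the sets
    inter = set(space).intersection(*sets)                       # product of the sets
    comp_union = set().union(*(C(E) for E in sets))              # sum of complements
    comp_inter = set(space).intersection(*(C(E) for E in sets))  # product of complements
    E1, E2 = sets[0], sets[1]
    return all([
        inter == C(comp_union),     # product = complement of sum of complements
        union == C(comp_inter),     # sum = complement of product of complements
        C(inter) == comp_union,     # complement of product = sum of complements
        C(union) == comp_inter,     # complement of sum = product of complements
        C(E2) - C(E1) == E1 - E2,   # difference of complements
        E1 - E2 == E1 & C(E2),      # difference as a product
        C(C(E1)) == E1,             # double complement
    ])
```

For example, `check_complement_formulae([{1, 2, 3}, {2, 3, 4}, {3, 5}], set(range(10)))` evaluates all seven identities on three subsets of {0, ..., 9}.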


31. Point sets. We now introduce some ideas and results that relate 
specially to point sets. When discussing the theory of the multiple 
Riemann integral in Volume II, some information was given in regard
to point sets on a plane or in any n-dimensional space. We shall repeat 
what was said in Volume II with certain important additions. For the 
sake of definiteness we shall talk about point sets on the XY plane. 
Everything that is said may easily be extended to the case of a straight 
line or any n-dimensional space. 

Let us consider point sets on a plane referred to Cartesian axes $XY$.
A set is said to be bounded if the distance of any point of it from the
origin is less than a definite positive number $N$, i.e. $x^2 + y^2 < N^2$ for
all points of the set. An $\varepsilon$-neighbourhood of the point $P(a, b)$ is a
closed circle with centre $P$ and radius $\varepsilon$, i.e. the set of points $(x, y)$
satisfying the condition $(x - a)^2 + (y - b)^2 \leqslant \varepsilon^2$. The point $P$ is said
to be a limit point or point of accumulation of the set $\mathcal{E}$ if any
$\varepsilon$-neighbourhood of $P$ contains an infinite set of points of $\mathcal{E}$. The point $P$
itself may or may not belong to $\mathcal{E}$. If all the limit points belong to $\mathcal{E}$,
$\mathcal{E}$ is said to be closed. A point $P$ of $\mathcal{E}$ is described as an interior point
if every point of an $\varepsilon$-neighbourhood of $P$ belongs to $\mathcal{E}$. The set $\mathcal{E}$ is
said to be open if every point of it is an interior point. Closed sets are
usually denoted by the letter $F$ with various subscripts (the French
word fermé = closed), and open sets by $O$ (ouvert = open).

The empty set is the 'set' that contains no points. Our future
theorems may embrace the empty set, in which case it has to be
regarded as both open and closed. The boundary of an open set $O$ is
the set $l$ of points $P$ with the following property: the point $P$ does not
itself belong to $O$, but any $\varepsilon$-neighbourhood of $P$ contains points of $O$.
Since $O$ consists of interior points, we can say that any $\varepsilon$-neighbourhood
of $P$ contains an infinite set of points of $O$, and the boundary $l$
of an open set $O$ can be defined as the set of limit points of $O$ not
belonging to $O$. It may easily be shown that the boundary of an open
set is a closed set [II; 89].

Let $\mathcal{E}$ be a set. We associate with it all its limit points and write $\bar{\mathcal{E}}$
for the set obtained. This operation is known as closure of the set $\mathcal{E}$. If
$\mathcal{E}$ is a closed set, then $\bar{\mathcal{E}} = \mathcal{E}$. Let us show that $\bar{\mathcal{E}}$ is a closed set. Let
$P$ be a limit point of $\bar{\mathcal{E}}$, i.e. there is an infinite sequence of different
points $P_n$ $(n = 1, 2, \ldots)$ belonging to $\bar{\mathcal{E}}$, where $P_n \to P$. If there is an
infinite set of points of $\mathcal{E}$ among the $P_n$, $P$ is a limit point of $\mathcal{E}$, and
hence, by the closure process, appears in $\bar{\mathcal{E}}$. Now let all the $P_n$, as from
a certain $n$, not belong to $\mathcal{E}$. By hypothesis, they belong to $\bar{\mathcal{E}}$, so that
they are limit points of $\mathcal{E}$. In any $\varepsilon/2$-neighbourhood of the point $P$
there is an infinite set of points $P_n$, and in any $\varepsilon/2$-neighbourhood of
each point $P_n$ there is an infinite set of points of $\mathcal{E}$. It follows at once
from this that any $\varepsilon$-neighbourhood of the point $P$ contains an infinite
set of points of $\mathcal{E}$, i.e. $P$ is a limit point of $\mathcal{E}$, and must therefore belong
to $\bar{\mathcal{E}}$ by virtue of the closure process.

We have thus shown that the set $\bar{\mathcal{E}}$ obtained by closure of any given
set $\mathcal{E}$ is necessarily a closed set. It may be remarked that the whole of
the plane is simultaneously an open and a closed set. We do not
associate the points at infinity with the plane. Every finite set of
points is a closed set. It can never have limit points.

We now bring in the concept of the distance between two sets. The
distance between sets $\mathcal{E}_1$ and $\mathcal{E}_2$ is the strict lower bound of the
distances from all possible points of $\mathcal{E}_1$ to points of $\mathcal{E}_2$. If the sets have a
common point, the distance between them is zero. But the distance
between sets can also be zero when they have no common points.
Points of two sets that have no common points may in fact be
indefinitely close. This cannot be the case if the sets are bounded and
closed, and we proved the following theorem in Volume II: if $\mathcal{E}_1$ and $\mathcal{E}_2$
are bounded and closed sets with no common points, the distance $d$
between them is positive, and at least one pair of points, $P$ of $\mathcal{E}_1$ and
$Q$ of $\mathcal{E}_2$, can be found such that $PQ = d$. It follows at once from the
proof of this theorem that it also holds when only one of the given
closed sets is bounded. In particular, the distance of any given point
of an open set from the boundary of this set is positive.


32. Properties of closed and open sets. We shall now prove some
special properties of closed and open sets. 

THEOREM 1. The sum of a finite or denumerable number of open sets 
is an open set. The product of a finite number of open sets is an open set. 

We take the sum of a finite or denumerable number of open sets:

$$\mathcal{E} = \sum_n O_n.$$

If $P \in \mathcal{E}$, then $P$ belongs to at least one of the $O_n$. Let $P \in O_k$.
Since $O_k$ is an open set, an $\varepsilon$-neighbourhood of $P$ also belongs to $O_k$.
This $\varepsilon$-neighbourhood of $P$ also belongs to the sum $\mathcal{E}$, whence it follows
that $\mathcal{E}$ is an open set. We now take the finite product

$$\mathcal{E} = \prod_{n=1}^{m} O_n$$

and let $P$ belong to $\mathcal{E}$. We show, as above, that an $\varepsilon$-neighbourhood
of $P$ also belongs to $\mathcal{E}$. Since $P$ belongs to $\mathcal{E}$, $P$ belongs to all the $O_k$
$(k = 1, 2, \ldots, m)$. Since the $O_k$ are open sets, there exists for any $O_k$
an $\varepsilon_k$-neighbourhood of $P$ belonging to $O_k$. If the number $\varepsilon$ is taken
equal to the least of the $\varepsilon_k$ $(k = 1, 2, \ldots, m)$, the number of which is
finite, the $\varepsilon$-neighbourhood of $P$ will belong to all the $O_k$, and
consequently to $\mathcal{E}$. Notice that it is not permissible to assert that the
product of a denumerable number of open sets is an open set.

THEOREM 2. The set CF is open and the set CO is closed. 

Let us prove the first assertion. Let $P$ belong to $CF$. We have to
show that an $\varepsilon$-neighbourhood of $P$ belongs to $CF$. This follows from
the fact that, if there were points of $F$ in any $\varepsilon$-neighbourhood of the
point $P$, then $P$, which does not belong to $F$ by hypothesis, would be a
limit point of $F$ and, since $F$ is closed, must belong to $F$, which implies
a contradiction.

THEOREM 3. The product of a finite or denumerable number of closed
sets is a closed set. The sum of a finite number of closed sets is a closed set.

Let us show, for instance, that the set

$$\mathcal{E} = \prod_n F_n$$

is closed. On passing to the complementary sets, we can write [30]

$$C\mathcal{E} = \sum_n CF_n.$$

By Theorem 2, the $CF_n$ are open sets, and by Theorem 1, the set $C\mathcal{E}$
is also open, so that its complementary set $\mathcal{E}$ is closed. Notice that
the sum of a denumerable number of closed sets may not be a closed set.
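The two cautionary remarks, one following Theorem 1 and one following Theorem 3, can be illustrated on a straight line. The examples below are standard ones chosen here for illustration; they are not taken from the text.

```latex
% A denumerable product of open sets that is not open:
\[
  O_n = \left( -\tfrac{1}{n},\ \tfrac{1}{n} \right) \quad (n = 1, 2, \ldots), \qquad
  \prod_{n=1}^{\infty} O_n = \{0\},
\]
% and the point 0 is not an interior point of the set {0}.

% A denumerable sum of closed sets that is not closed:
\[
  F_n = \left[ \tfrac{1}{n},\ 1 \right] \quad (n = 1, 2, \ldots), \qquad
  \sum_{n=1}^{\infty} F_n = (0, 1],
\]
% whose limit point 0 does not belong to the sum.
```

In both cases any finite number of the sets behaves as the theorems require; the failure only appears in the passage to a denumerable family.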

THEOREM 4. The set O — F is an open set and F — O is a closed set. 
The following equations are easily verified: 


$$O - F = O \cdot CF; \qquad F - O = F \cdot CO.$$


Theorem 4 is a consequence of these, in view of the previous 
theorems. 

We shall say that a set $\mathcal{E}$ is covered by a system $M$ of sets if
every point of $\mathcal{E}$ belongs to at least one of the sets of system $M$.

THEOREM 5 (Borel). If a closed bounded set $F$ is covered by an infinite
system $\alpha$ of open sets $O$, we can extract from this infinite system a finite
number of open sets which also cover $F$.

We use reductio ad absurdum, i.e. we assume that there is no finite
number of open sets of system $\alpha$ that covers $F$ and hence arrive at a
contradiction. Since $F$ is a bounded set, all the points of $F$ belong to
some finite two-dimensional interval $\Delta_0$ $(a \leqslant x \leqslant b;\ c \leqslant y \leqslant d)$. We
split this closed interval $\Delta_0$ into four equal parts, by halving the intervals
$[a, b]$ and $[c, d]$. Each of these four intervals will be taken to be closed.
The points of $F$ which fall into one of these four intervals will form a
closed set by virtue of Theorem 3, and at least one of these closed sets
cannot be covered by a finite number of open sets of the system $\alpha$.
We take the closed interval of our four for which this is the case. We split
this interval into four equal parts and repeat the argument. We thus
obtain a system of embedded intervals $\Delta_0, \Delta_1, \Delta_2, \ldots$, each successive
member of which is a quarter of the preceding one, and the following
holds good: the set of points of $F$ belonging to $\Delta_k$ cannot be covered
by a finite number of open sets of the system $\alpha$ for any $k$. As $k$ increases
indefinitely, the intervals $\Delta_k$ shrink indefinitely to a point $P$, which
belongs to all the $\Delta_k$. Since $\Delta_k$ contains, for any $k$, an infinite set of
points of $F$, the point $P$ is a limit point of $F$, i.e. $P$ belongs to $F$, since
$F$ is a closed set. The point $P$ is therefore covered by some open set $O'$
belonging to the system $\alpha$. An $\varepsilon$-neighbourhood of $P$ will also belong
to the open set $O'$. With sufficiently large values of $k$, the intervals $\Delta_k$
fall inside the above-mentioned $\varepsilon$-neighbourhood of $P$. These $\Delta_k$ will
therefore be entirely covered by the single open set $O'$ of system $\alpha$, and
this contradicts the fact that the points of $F$ that belong to $\Delta_k$ cannot
be covered by a finite number of open sets of $\alpha$ for any $k$. The theorem is
therefore proved.
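Both hypotheses of Borel's theorem, closedness and boundedness, are needed. The following one-dimensional examples are standard illustrations added here; they do not appear in the text.

```latex
% Bounded but not closed: the open interval (0, 1) is covered by the open sets
\[
  O_n = \left( \tfrac{1}{n},\ 1 \right) \quad (n = 2, 3, \ldots),
\]
% but the sum of any finite number of them is (1/N, 1), where N is the
% largest index used, and this misses every point of (0, 1/N].

% Closed but not bounded: the half-line [0, +\infty) is covered by
\[
  O_n = (-1,\ n) \quad (n = 1, 2, \ldots),
\]
% but the sum of any finite number of them is (-1, N), which misses
% every point x >= N.
```

In the proof above, boundedness is what allows the initial interval $\Delta_0$ to be chosen, and closedness is what guarantees that the limit point $P$ belongs to $F$.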

THEOREM 6. An open set can be expressed as the sum of a denumerable 
number of semi-open intervals, pairs of which have no common points 
(i.e. the intervals are non-overlapping). 

We recall that a semi-open interval on the plane is a finite interval
defined by inequalities of the form $a < x \leqslant b$; $c < y \leqslant d$.

We draw a net of squares on the plane with sides parallel to the 
axes and length of side unity. The set of these squares is a denumerable 
set. We choose those squares, all the points of which belong to the 
given open set O. The number of such squares may be finite or de- 
numerable, or there may be no such squares. Each of the remaining 
squares of the net is divided into four equal squares, and from the new 
squares obtained we again choose those, every point of which belongs 
to O. Each of the remaining squares is again divided into four equal 
parts, and the squares chosen, every point of which belongs to O, 
and so on. We show that every point P of the set O falls into one of 
the chosen squares, all the points of which belong to O. In fact, let d 
be the positive distance of P from the boundary of O. When we arrive 
at squares whose diagonals are less than d, we can obviously say that 
P has already fallen inside a square, every point of which belongs to O. 
If the chosen squares are regarded as semi-open, any pair of them will 
have no common points, and the theorem is proved. The number of 
chosen squares must be denumerable, since the finite sum of semi-open 
intervals is clearly not an open set. On writing $\Delta_n$ for the semi-open
squares which we have obtained as a result of the above process, we
can write

$$O = \sum_{n=1}^{\infty} \Delta_n. \qquad (21)$$

In the case of one dimension, i.e. a straight line, the following
statement is readily proved: every open set on a straight line is the
sum of a finite or denumerable number of non-overlapping open
intervals.
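The construction used in the proof of Theorem 6 can be tried out in one dimension, where squares become intervals. The sketch below is an illustration under simplifying assumptions: the open set is a single open interval $(a, b)$, semi-open cells are taken in the form $(l, r]$, and the subdivision is stopped after `max_level` halvings, so only finitely many of the denumerably many cells are produced. The function name `dyadic_decomposition` is hypothetical.

```python
from fractions import Fraction
from math import floor, ceil

def dyadic_decomposition(a, b, max_level):
    """Semi-open cells (l, r] lying wholly inside the open interval (a, b).

    Level 0 uses cells of length 1; each cell that meets (a, b) without
    lying inside it is halved, up to `max_level` times.
    Each cell is returned as a pair (l, r) standing for l < x <= r.
    """
    a, b = Fraction(a), Fraction(b)
    chosen = []
    pending = [(Fraction(k), Fraction(k + 1)) for k in range(floor(a), ceil(b))]
    for _ in range(max_level + 1):
        next_pending = []
        for l, r in pending:
            if a <= l and r < b:      # every point of (l, r] lies in (a, b)
                chosen.append((l, r))
            elif l < b and r > a:     # the cell meets (a, b): halve it
                mid = (l + r) / 2
                next_pending += [(l, mid), (mid, r)]
            # otherwise the cell has no point in (a, b) and is discarded
        pending = next_pending
    return sorted(chosen)

cells = dyadic_decomposition(Fraction(1, 3), Fraction(5, 3), max_level=6)
total = sum(r - l for l, r in cells)
```

With exact rational arithmetic the chosen cells are non-overlapping, and their total length falls short of $b - a$ by at most the length of two cells of the finest level; the shortfall never vanishes after finitely many steps, which reflects the remark that a finite sum of semi-open intervals cannot be an open set.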

Everything that has been said in the last two sections is applicable
to point sets on a straight line, in three-dimensional space and in general
in $n$-dimensional space. The only difference is in the definition of
$\varepsilon$-neighbourhood and interval. In three-dimensional space an
$\varepsilon$-neighbourhood of a point $P$ is a sphere with centre at $P$ and radius $\varepsilon$, and
an interval is a rectangular parallelepiped, the edges of which are parallel
to the axes. A semi-open interval is defined by the inequalities:
$a < x \leqslant b$; $a_1 < y \leqslant b_1$; $a_2 < z \leqslant b_2$. In the case of a straight line
an $\varepsilon$-neighbourhood of a point $x_0$ is defined by the inequality
$x_0 - \varepsilon \leqslant x \leqslant x_0 + \varepsilon$.


33. Elementary figures. A fundamental role will be played in what
follows by finite semi-open intervals, and for brevity we shall speak
of these simply as intervals. Let $G(\Delta)$ be a non-negative, additive and
normal interval function. Our problem is to extend it to a wider
class of point sets whilst preserving all its previous properties. We
shall call the sum of a finite number of non-overlapping intervals
$\Delta_k$ $(k = 1, 2, \ldots, m)$ an elementary figure. Using $R$ to denote such
an elementary figure, we can write

$$R = \sum_{k=1}^{m} \Delta_k. \qquad (22)$$

We can evidently use a different method to split this elementary
figure into non-overlapping intervals:

$$R = \sum_{k=1}^{m'} \Delta_k'. \qquad (23)$$

It is easily seen that we have for any two such subdivisions:

$$\sum_{k=1}^{m} G(\Delta_k) = \sum_{k=1}^{m'} G(\Delta_k'). \qquad (24)$$

To see this, we need only carry out a new subdivision of $R$, consisting
of the product of (22) and (23), and recall the fact that $G(\Delta)$ is additive.
It turns out that both the left and right-hand sides of (24) represent
the sum of the values of $G(\Delta)$ for the intervals of the new subdivision.
To obtain the left-hand side of (24), we only need to regroup the terms
of this latter sum that correspond to the sub-intervals which belong
to the same $\Delta_k$, whilst the right-hand side of (24) is got by carrying out
the grouping for terms corresponding to sub-intervals belonging to the
same $\Delta_k'$. Thus, if the elementary figure $R$ is split by some method
into non-overlapping sub-intervals, the sum of the values of the
function $G(\Delta)$ for these sub-intervals has a completely determinate
value, i.e. does not depend on the method of subdivision of $R$. This
sum will be taken as the value of the function $G(R)$ for the elementary
figure $R$, i.e.

$$G(R) = \sum_{k=1}^{m} G(\Delta_k) \qquad (25)$$

with any subdivision of $R$ into a finite number of non-overlapping
intervals. We have thus extended the function $G(\Delta)$ very simply to
elementary figures. By making use of (21), we could similarly have
extended $G(\Delta)$ to all open sets. However, we shall adopt a rather
different procedure. At the same time, open sets will play a
fundamental role in our treatment. In the present section, we shall
consider some further simple properties of intervals and elementary
figures.
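The independence of the sum from the method of subdivision can be seen concretely in one dimension. The sketch below assumes, purely for illustration, that the interval function is generated by a non-decreasing point function $g$ through $G((a, b]) = g(b) - g(a)$, with $g(x) = x^3$; this particular choice, and the helper names `G`, `G_figure` and `common_refinement`, are assumptions and not part of the text.

```python
from fractions import Fraction

def G(interval, g):
    """Value of the interval function on the semi-open interval (a, b]."""
    a, b = interval
    return g(b) - g(a)

def G_figure(intervals, g):
    """Sum of G over a subdivision of an elementary figure."""
    return sum(G(iv, g) for iv in intervals)

def common_refinement(sub1, sub2):
    """Product of two subdivisions: all non-empty intersections (l, r]."""
    out = []
    for a1, b1 in sub1:
        for a2, b2 in sub2:
            l, r = max(a1, a2), min(b1, b2)
            if l < r:
                out.append((l, r))
    return out

g = lambda x: x ** 3  # an assumed non-decreasing generating function

# Two subdivisions of the same elementary figure R = (0, 1] + (2, 4].
half = Fraction(1, 2)
sub1 = [(0, 1), (2, 3), (3, 4)]
sub2 = [(0, half), (half, 1), (2, 4)]
refined = common_refinement(sub1, sub2)
```

Both subdivisions, and their product, give the same value for the figure; regrouping the terms of the refined sum by the intervals of either subdivision is precisely the argument used in the text.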

Notice that, if $R_1 \subset R_2$, then $G(R_1) \leqslant G(R_2)$. This follows at once
from the fact that $G(\Delta)$ is non-negative, if we make use of a division
of $R_2$ into intervals such that sub-intervals having points in common
with $R_1$ are wholly contained in $R_1$. Let $\delta_k$ $(k = 1, \ldots, p)$ be intervals
which may overlap. If we produce the straight lines on which the
sides of the $\delta_k$ lie, we split the sum of the intervals $\delta_k$ into
sub-intervals that have the following property: if two of them have a
common point, they are entirely coincident. On reckoning
superimposed intervals as a single interval, we obtain an elementary figure
$R_0$, which is obviously the sum of the intervals: $R_0 = \sum_{k=1}^{p} \delta_k$, where we have

$$G(R_0) \leqslant \sum_{k=1}^{p} G(\delta_k), \qquad (26)$$

and the $<$ sign holds if the value of the function $G(\Delta)$ is positive for at
any rate one of the superimposed intervals.

We now introduce a new concept which will be useful in later
constructions. Let $\Delta$ $(a < x \leqslant b,\ c < y \leqslant d)$ be an interval and $\alpha$ a
positive number. We call the interval defined by the inequalities
$a + \alpha < x \leqslant b$, $c + \alpha < y \leqslant d$ an $\alpha$-compression of the interval $\Delta$,
and write it symbolically as ${}^{(\alpha)}\Delta$. We define an $\alpha$-expansion of $\Delta$ as the
interval defined by $a < x \leqslant b + \alpha$, $c < y \leqslant d + \alpha$, and write it
symbolically as $\Delta^{(\alpha)}$. The differences $\Delta - {}^{(\alpha)}\Delta = R'$ and $\Delta^{(\alpha)} - \Delta = R''$ are
elementary figures. Since $G(\Delta)$ is non-negative, we have

$$G(R') = G(\Delta) - G({}^{(\alpha)}\Delta) \geqslant 0 \quad \text{and} \quad G(R'') = G(\Delta^{(\alpha)}) - G(\Delta) \geqslant 0,$$

and it follows at once from the normality of $G(\Delta)$ that

$$\lim_{\alpha \to +0} G({}^{(\alpha)}\Delta) = \lim_{\alpha \to +0} G(\Delta^{(\alpha)}) = G(\Delta). \qquad (26_1)$$
We now prove a lemma which will be needed later.

LEMMA. If the elementary figure $R$ is covered by a finite or denumerable
number of intervals $\delta_k$ (which may have points in common), then

$$\sum_k G(\delta_k) \geqslant G(R).\dagger \qquad (27)$$

$\dagger$ $\sum_k$ means $\sum_{k=1}^{m}$, where $m = \infty$ or is $\geqslant 1$ and finite.

The assertion of the lemma can be seen by inspection. We obtain a
rigorous proof by using Borel's theorem (Theorem 5 of [32]). Let $\varepsilon$ be a
given positive number. We split $R$ into a finite number of non-overlapping
intervals $\Delta_k$ $(k = 1, 2, \ldots, m)$, and subject each sub-interval $\Delta_k$ to an
$\alpha$-compression, the positive number $\alpha$ being fixed so small that the
sum of the values of $G(\Delta)$ is $> G(R) - \varepsilon$ for the compressed intervals.
On writing $R_\varepsilon$ for the elementary figure equal to the sum of the
compressed intervals, we can write

$$G(R_\varepsilon) > G(R) - \varepsilon. \qquad (28)$$

Each of the intervals $\delta_k$ appearing in the covering of $R$ is subjected
to an $\alpha_k$-expansion, the positive numbers $\alpha_k$ being chosen so small that

$$G(\delta_k^{(\alpha_k)}) < G(\delta_k) + \frac{\varepsilon}{2^k}. \qquad (29)$$

The compressed intervals, the sum of which has given $R_\varepsilon$, are made
into closed intervals, i.e. we close each of these intervals. The sum
of the closed intervals obtained (the number of which is finite) is a
closed set $F$, where obviously $F \subset R$. If we exclude the boundary from
the $\delta_k^{(\alpha_k)}$, i.e. two sides and one vertex, an open interval remains. On
recalling the expansion of the intervals $\delta_k$, we can say that the open
intervals so obtained cover the above-mentioned closed intervals, i.e. cover
the bounded closed set $F$. For instance, let the number of intervals
$\delta_k$ be infinite. By Borel's theorem, it is sufficient to take a finite number of
the intervals $\delta_k^{(\alpha_k)}$ $(k = 1, 2, \ldots, q)$ to cover $F$ and hence cover $R_\varepsilon$.
The sum of the intervals $\delta_k^{(\alpha_k)}$ $(k = 1, 2, \ldots, q)$ is an elementary
figure $R'$, where $R_\varepsilon \subset R'$, and consequently $G(R_\varepsilon) \leqslant G(R')$. We also
have, by (26) and (29):

$$G(R') \leqslant \sum_{k=1}^{q} G(\delta_k^{(\alpha_k)}) < \sum_{k=1}^{q} G(\delta_k) + \varepsilon \sum_{k=1}^{q} \frac{1}{2^k},$$

whence it follows at once that

$$G(R_\varepsilon) \leqslant G(R') < \sum_{k=1}^{q} G(\delta_k) + \varepsilon,$$

so that all the more:

$$G(R_\varepsilon) < \sum_k G(\delta_k) + \varepsilon.$$

On comparing with (28), we obtain

$$G(R) - \varepsilon < \sum_k G(\delta_k) + \varepsilon, \quad \text{i.e.} \quad \sum_k G(\delta_k) > G(R) - 2\varepsilon.$$

The sum on the left is independent of $\varepsilon$, and in view of the
arbitrariness of $\varepsilon$, we obtain inequality (27). Notice that the terms of the
sum on the left of (27) are non-negative and finite, whilst the sum
itself may be equal to $(+\infty)$.

We shall often be concerned in future with sums of an infinite number
of non-negative terms. If at least one of the terms of such a sum is
equal to $(+\infty)$, the total sum must be reckoned equal to $(+\infty)$.
But, as we have just indicated, it may happen that all the terms are
finite, whilst the sum is equal to $(+\infty)$, i.e. the series is divergent.


34. Exterior measure and its properties. We now use the function
$G(\Delta)$ to associate any point set on the plane with a non-negative
number, which will be termed the exterior measure of the set.

DEFINITION. Let a set $\mathcal{E}$ be covered by intervals $\Delta_n$ $(n = 1, 2, \ldots)$,
the number of which is finite or denumerable. The exterior measure of $\mathcal{E}$
is the strict lower bound of the values of the sums:

$$\sum_n G(\Delta_n) \qquad (30)$$

for all possible coverings of $\mathcal{E}$ by intervals. We shall denote the exterior
measure by the symbol $|\mathcal{E}|_G$, where the subscript indicates the function
$G(\Delta)$ on which the definition of exterior measure is based. Thus, we can
write for any covering:

$$\sum_n G(\Delta_n) \geqslant |\mathcal{E}|_G \quad \text{and} \quad |\mathcal{E}|_G = \inf \sum_n G(\Delta_n). \qquad (31)$$

If sums (30) are equal to $(+\infty)$ for any covering, the exterior
measure has to be regarded as equal to $(+\infty)$. The exterior measure
of a bounded set is always finite, since such a set can be covered by a
single interval $\Delta_0$, and $G(\Delta_0)$ is finite by hypothesis. Notice that an
unbounded set $\mathcal{E}$ cannot be covered by a finite number of intervals,
since we have agreed to take each interval as finite. Nevertheless, the
exterior measure of an unbounded set may be a finite number. We
shall now prove a number of theorems on exterior measure.
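A short worked example may help to fix the definition. It assumes, for illustration only, that $G(\Delta)$ is the ordinary area of the interval $\Delta$ (the text allows any non-negative, additive and normal $G$, and the conclusion below need not hold for other choices). Under that assumption every denumerable set of points has exterior measure zero, although such a set may be unbounded.

```latex
% Assumption: G(\Delta) = area of \Delta.
% Let \mathcal{E} consist of the points P_n = (x_n, y_n), n = 1, 2, \ldots
% Given \varepsilon > 0, cover P_n by the semi-open square
\[
  \Delta_n :\quad x_n - h_n < x \leqslant x_n, \qquad y_n - h_n < y \leqslant y_n,
  \qquad h_n^2 = \frac{\varepsilon}{2^n}.
\]
% Then G(\Delta_n) = h_n^2 = \varepsilon / 2^n, and for this covering
\[
  \sum_{n=1}^{\infty} G(\Delta_n) = \varepsilon \sum_{n=1}^{\infty} \frac{1}{2^n} = \varepsilon ,
\]
% so that |\mathcal{E}|_G \leqslant \varepsilon for every \varepsilon > 0, i.e. |\mathcal{E}|_G = 0.
```

The device of allotting $\varepsilon/2^n$ to the $n$-th set is the same one that appears in the proofs of the theorems of this section.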

THEOREM 1. If $\mathcal{E}' \subset \mathcal{E}''$, then $|\mathcal{E}'|_G \leqslant |\mathcal{E}''|_G$.

Every covering of $\mathcal{E}''$ is a covering of $\mathcal{E}'$, so that the lower bound
of sums (30) for $\mathcal{E}'$ may be less than for $\mathcal{E}''$, but can never be greater
than for $\mathcal{E}''$, which is what we had to prove.

THEOREM 2. The exterior measure of every elementary figure $R$ is
equal to $G(R)$, i.e. $|R|_G = G(R)$.

If $R$ is divided by some method into sub-intervals $\Delta_k$, these latter
cover $R$ and we get $|R|_G \leqslant G(R)$, by (25) and the definition of
exterior measure as the lower bound of sums (30) for all possible coverings
of $R$. We now prove the reverse inequality. If the intervals $\Delta_k'$ cover $R$,
we have $\sum_k G(\Delta_k') \geqslant G(R)$ by the lemma of the previous section,
whence it follows immediately that $|R|_G \geqslant G(R)$. These two inequalities
together give us $|R|_G = G(R)$.

THEOREM 3. The exterior measure of the sum of a finite or denumerable
number of sets is $\leqslant$ the sum of the exterior measures of the individual
sets, i.e.

$$\Bigl| \sum_n \mathcal{E}_n \Bigr|_G \leqslant \sum_n |\mathcal{E}_n|_G. \qquad (32)$$

We shall often use the single letter $S$ in future to denote the covering
of a given set. In this case, we write $\sigma(S)$ for the sum (30) for such a
covering. Given a positive $\varepsilon$, by the definition of strict lower bound,
there exists a covering $S_n$ of the set $\mathcal{E}_n$ such that
$\sigma(S_n) < |\mathcal{E}_n|_G + \varepsilon/2^n$. We take the intervals appearing in all the $S_n$ $(n = 1, 2, \ldots)$.
They effect a covering $S$ for $\sum_n \mathcal{E}_n$, and we obviously have for this
covering:

$$\sigma(S) = \sum_n \sigma(S_n) \leqslant \sum_n |\mathcal{E}_n|_G + \varepsilon \sum_n \frac{1}{2^n} \leqslant \sum_n |\mathcal{E}_n|_G + \varepsilon.$$

By the definition of strict lower bound:

$$\Bigl| \sum_n \mathcal{E}_n \Bigr|_G \leqslant \sum_n |\mathcal{E}_n|_G + \varepsilon,$$

and, since $\varepsilon$ is arbitrary, we in fact obtain inequality (32). Notice
that the sum on the right-hand side of (32), or even an individual
term of this sum, can be equal to $(+\infty)$. Since the terms are
non-negative, their order is of no importance. Now that the exterior
measure has been defined we have extended the function $G(\Delta)$ to
all possible point sets on the plane, but, generally speaking, we have
lost the additive property of this function. For it can be shown that,
if pairs of the sets $\mathcal{E}_n$ have no common points, we can nevertheless
have the $<$ sign in certain cases in (32). Below, we shall distinguish
the class of sets for which the exterior measure retains the property
of additiveness. As a preliminary, we prove a further theorem on
exterior measure.

THEOREM 4. Every set $\mathcal{E}$ can be covered by an open set $O$ whose exterior
measure differs by as little as desired from the exterior measure of $\mathcal{E}$, i.e.
if $\mathcal{E}$ is any set and $\varepsilon$ is any given positive number, there exists an open
set $O$ such that $\mathcal{E} \subset O$ and $|O|_G \leqslant |\mathcal{E}|_G + \varepsilon$.

If $|\mathcal{E}|_G = +\infty$, the inequality $|O|_G \leqslant |\mathcal{E}|_G + \varepsilon$ is satisfied for
any covering of the set $\mathcal{E}$ by an open set. We shall assume in future
that $|\mathcal{E}|_G$ is finite. Let $\varepsilon$ be a given positive number. We choose the
covering of $\mathcal{E}$ by intervals $\Delta_n$ such that the inequality holds:

$$\sum_n G(\Delta_n) < |\mathcal{E}|_G + \frac{\varepsilon}{2}. \qquad (33)$$

Each of the intervals $\Delta_n$ is subjected to an $\alpha_n$-expansion, and the
positive numbers $\alpha_n$ are chosen so that

$$G(\Delta_n^{(\alpha_n)}) < G(\Delta_n) + \frac{\varepsilon}{2^{n+1}}. \qquad (34)$$

If we exclude the boundary from $\Delta_n^{(\alpha_n)}$, i.e. two sides and one vertex, the
sum of the open intervals obtained gives some open set $O$, which
is obviously covered by the intervals $\Delta_n^{(\alpha_n)}$. We have by the definition of
strict lower bound:

$$|O|_G \leqslant \sum_n G(\Delta_n^{(\alpha_n)}).$$

We use (34) to write further:

$$|O|_G \leqslant \sum_n G(\Delta_n) + \frac{\varepsilon}{2},$$

and finally, on taking (33) into account, we get

$$|O|_G < |\mathcal{E}|_G + \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = |\mathcal{E}|_G + \varepsilon.$$


35. Measurable sets. We now distinguish a class of sets which will
be described as measurable, and for which we shall later prove that
the exterior measure is additive. We shall refer to the exterior measure
of a measurable set as simply the measure.

DEFINITION. A set $\mathcal{E}$ is said to be measurable if it can be covered by an
open set $O$ such that the exterior measure of the difference $O - \mathcal{E}$ is as
small as desired, i.e. $\mathcal{E}$ is said to be measurable if, given a positive $\varepsilon$,
there exists an open set $O$ such that $\mathcal{E} \subset O$ and $|O - \mathcal{E}|_G \leqslant \varepsilon$. The
exterior measure of a measurable set will be referred to simply as the measure
of the set.

The requirement imposed in the definition of measurable set is
stronger than the property stated in Theorem 4. This latter property
holds for all sets, whilst, given a certain choice of $G(\Delta)$, there exist
sets which are not measurable, i.e. which are not subject to the above
definition. We can use the symbol $|\mathcal{E}|_G$ to denote the measure of a
measurable set, since the measure of a measurable set coincides by
definition with exterior measure. We prove next that any interval $\Delta$
is measurable. Its exterior measure is equal to $G(\Delta)$ by Theorem 2. We
shall see later that every elementary figure is measurable. We are
therefore justified in simply writing $G(\mathcal{E})$ for the measure of any
measurable set $\mathcal{E}$. We further agree to take the exterior measure, or
simply the measure, of the empty set as zero. This is in accordance
with our above definitions. We introduce one further definition: a set
$\mathcal{E}$ is called a set of measure zero with respect to $G(\Delta)$, or simply a set
of measure zero (since $G(\Delta)$ is assumed fixed), if $|\mathcal{E}|_G = 0$. It follows
at once from this definition that every part of a set of measure zero is
also a set of measure zero. We now prove a number of properties of
measurable sets. These properties will serve as a basis for the whole
of our future treatment.

THEOREM 5. An open set is measurable.

If $\mathcal{E}$ is an open set, it can be shown to be measurable simply by
taking $O$ as coincident with $\mathcal{E}$. In this case $|O - \mathcal{E}|_G = 0$.

THEOREM 6. Any interval $\Delta$ is a measurable set, and its measure is
equal to $G(\Delta)$.

Let $\Delta$ be an interval. On subjecting it to an $\alpha$-expansion, we obtain
the interval $\Delta^{(\alpha)}$. The difference $\Delta^{(\alpha)} - \Delta$ is an elementary figure,
and since $G(\Delta)$ is normal, we have $G(\Delta^{(\alpha)} - \Delta) = G(\Delta^{(\alpha)}) - G(\Delta) \to 0$
as $\alpha \to +0$, i.e. given any positive $\varepsilon$, there exists an $\alpha$ such that
$G(\Delta^{(\alpha)} - \Delta) < \varepsilon$. Let $O$ be the open interval
which is obtained from $\Delta^{(\alpha)}$ by the removal of its boundary. In view
of the expansion process, $\Delta \subset O$. To prove the measurability of $\Delta$, it
remains to show that $|O - \Delta|_G < \varepsilon$. We have $O \subset \Delta^{(\alpha)}$, and by
Theorem 1, we have $|O - \Delta|_G \leqslant |\Delta^{(\alpha)} - \Delta|_G = G(\Delta^{(\alpha)} - \Delta) < \varepsilon$,
i.e. $|O - \Delta|_G < \varepsilon$, which is what we had to prove. The measure of $\Delta$,
equal to the exterior measure of $\Delta$, is the same as $G(\Delta)$, since $\Delta$ is a
particular case of an elementary figure.

THEOREM 7. The concept of a set of measure zero, i.e. a set with an
exterior measure equal to zero, is the same as the concept of a measurable
set whose measure is zero.

If $|\mathcal{E}|_G = 0$, there exists by Theorem 4 an open set $O$ such that,
given any positive $\varepsilon$, $\mathcal{E} \subset O$ and $|O|_G \leqslant \varepsilon$, i.e. by Theorem 1, all the
more $|O - \mathcal{E}|_G \leqslant \varepsilon$, i.e. $\mathcal{E}$ is measurable. The measure of $\mathcal{E}$ is equal
to its exterior measure, i.e. is zero. Conversely, if $\mathcal{E}$ is a measurable
set and its measure is zero, by the definition of measure the exterior
measure of $\mathcal{E}$ is also zero, and the theorem is therefore proved.

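THEOREM 8. The sum of a finite or denumerable number of measurable
sets is a measurable set.

Let $\mathcal{E}_n$ be measurable sets, $\mathcal{E}$ their sum, $\mathcal{E} = \sum_n \mathcal{E}_n$, and $\varepsilon$ a given
positive number. By the definition of measurable set, there exist open
sets $O_n$ such that $\mathcal{E}_n \subset O_n$ and $|O_n - \mathcal{E}_n|_G \leqslant \varepsilon/2^n$. The sum of the
open sets $O_n$ is an open set $O$, where obviously $\mathcal{E} \subset O$, and by (6) of
[30], we have

$$O - \mathcal{E} \subset \sum_n (O_n - \mathcal{E}_n).$$

By using Theorems 1 and 3, we obtain

$$|O - \mathcal{E}|_G \leqslant \Bigl| \sum_n (O_n - \mathcal{E}_n) \Bigr|_G \leqslant \sum_n |O_n - \mathcal{E}_n|_G$$

or, since $|O_n - \mathcal{E}_n|_G \leqslant \varepsilon/2^n$:

$$|O - \mathcal{E}|_G \leqslant \varepsilon,$$

which proves the measurability of $\mathcal{E}$. We now turn to a proof of the
measurability of closed sets. We must first prove a lemma.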

LEMMA. If the distance between two sets $\mathcal{E}_1$ and $\mathcal{E}_2$ is given by a positive
number, then $|\mathcal{E}_1 + \mathcal{E}_2|_G = |\mathcal{E}_1|_G + |\mathcal{E}_2|_G$.

Let $d$ be the positive distance between the sets $\mathcal{E}_1$ and $\mathcal{E}_2$. Given any
positive $\varepsilon$, there exists a covering $S$ of the set $\mathcal{E}_1 + \mathcal{E}_2$ such that

$$\sigma(S) < |\mathcal{E}_1 + \mathcal{E}_2|_G + \varepsilon. \qquad (35)$$

We divide each of the intervals appearing in $S$ into a finite number
of intervals in such a way that the diagonals of all the intervals
obtained are less than $d$. All the intervals of $S$ are now divided into three
classes: the first class contains the intervals which cover only points
of $\mathcal{E}_1$, the second contains intervals which cover only points of $\mathcal{E}_2$, and
finally, the third contains intervals that cover neither points of $\mathcal{E}_1$
nor points of $\mathcal{E}_2$. There are no intervals that cover both points of $\mathcal{E}_1$
and points of $\mathcal{E}_2$. We can simply throw the intervals of the third class
out of the division of $S$. The sum $\sigma(S)$ can now only diminish, and
inequality (35) retains its force. We can therefore assume that the
covering $S$ is split into coverings $S_1$ and $S_2$, where the intervals of $S_1$
cover $\mathcal{E}_1$ and have no common points with $\mathcal{E}_2$, and the intervals of $S_2$
cover $\mathcal{E}_2$ and have no common points with $\mathcal{E}_1$. We have
$\sigma(S) = \sigma(S_1) + \sigma(S_2)$ and, by (35),

$$\sigma(S_1) + \sigma(S_2) < |\mathcal{E}_1 + \mathcal{E}_2|_G + \varepsilon. \qquad (36)$$

It follows from the definition of strict lower bound that
$|\mathcal{E}_1|_G \leqslant \sigma(S_1)$ and $|\mathcal{E}_2|_G \leqslant \sigma(S_2)$, and inequality (36) leads to the inequality
$|\mathcal{E}_1|_G + |\mathcal{E}_2|_G < |\mathcal{E}_1 + \mathcal{E}_2|_G + \varepsilon$, whence, since $\varepsilon$ is arbitrary, we
have $|\mathcal{E}_1|_G + |\mathcal{E}_2|_G \leqslant |\mathcal{E}_1 + \mathcal{E}_2|_G$. On the other hand, Theorem 3
gives $|\mathcal{E}_1 + \mathcal{E}_2|_G \leqslant |\mathcal{E}_1|_G + |\mathcal{E}_2|_G$. We therefore obtain
$|\mathcal{E}_1 + \mathcal{E}_2|_G = |\mathcal{E}_1|_G + |\mathcal{E}_2|_G$, which proves the lemma.

COROLLARY. If $F_1$ and $F_2$ are two disjoint closed sets at least
one of which is bounded, then $|F_1 + F_2|_G = |F_1|_G + |F_2|_G$.
If $F_k$ $(k = 1, 2, \ldots, m)$ are pairwise disjoint closed bounded sets, then

$$\Bigl| \sum_{k=1}^{m} F_k \Bigr|_G = \sum_{k=1}^{m} |F_k|_G.$$

This corollary is proved simply by using what
was said in [31] regarding the distance between closed sets with no
common points.

THEOREM 9. Closed sets are measurable.

We first suppose that $F$ is a bounded closed set, and let $\varepsilon$ be a given
positive number. By Theorem 4, there exists an open set $O$ such
that $F \subset O$ and $|O|_G \leqslant |F|_G + \varepsilon$. We show that this open set $O$
will in fact satisfy the inequality

$$|O - F|_G \leqslant \varepsilon, \qquad (37)$$

which appears in the definition of measurable set. By Theorem 4 of
[32], the difference $O - F$ is an open set, so that, by Theorem 6 of [32],
it can be written as the sum of a denumerable number of
non-overlapping intervals $\Delta_n$:

$$O - F = \sum_{n=1}^{\infty} \Delta_n. \qquad (38)$$

Having fixed the positive integer $m$, we consider the sum of the first
$m$ terms of sum (38), where each interval appearing in this sum is
subjected to an $\alpha$-compression by choosing the positive number $\alpha$ in
some given way. We thus obtain an elementary figure $R$:

$$R = \sum_{n=1}^{m} {}^{(\alpha)}\Delta_n.$$

If we close each interval ${}^{(\alpha)}\Delta_n$ $(n = 1, 2, \ldots, m)$, the sum written
gives us a closed set, which obviously coincides with the closure $\bar{R}$
of the elementary figure $R$. Each closed interval obtained from ${}^{(\alpha)}\Delta_n$ is
covered by the corresponding interval $\Delta_n$ forming part of sum (38). Hence the closed
set $\bar{R}$ has no points in common with $F$, so that the distance between
these sets is positive. The distance between $R$ and $F$ will be all the more
positive, and we have by the lemma:

$$\Bigl| \sum_{n=1}^{m} {}^{(\alpha)}\Delta_n + F \Bigr|_G = \Bigl| \sum_{n=1}^{m} {}^{(\alpha)}\Delta_n \Bigr|_G + |F|_G.$$

But, by (38), $\sum_{n=1}^{m} {}^{(\alpha)}\Delta_n + F \subset O$, so that

$$\Bigl| \sum_{n=1}^{m} {}^{(\alpha)}\Delta_n \Bigr|_G + |F|_G \leqslant |O|_G.$$

On taking into account the inequality $|O|_G \leqslant |F|_G + \varepsilon$, we obtain
from this:

$$\Bigl| \sum_{n=1}^{m} {}^{(\alpha)}\Delta_n \Bigr|_G + |F|_G \leqslant |F|_G + \varepsilon.$$

By hypothesis, $F$ is a bounded set, so that $|F|_G$ is a finite number.
The last inequality leads us to

$$\Bigl| \sum_{n=1}^{m} {}^{(\alpha)}\Delta_n \Bigr|_G \leqslant \varepsilon.$$

But

$$\Bigl| \sum_{n=1}^{m} {}^{(\alpha)}\Delta_n \Bigr|_G = G(R) = \sum_{n=1}^{m} G({}^{(\alpha)}\Delta_n) = \sum_{n=1}^{m} |{}^{(\alpha)}\Delta_n|_G,$$

and the previous inequality can be rewritten as

$$\sum_{n=1}^{m} |{}^{(\alpha)}\Delta_n|_G \leqslant \varepsilon.$$

On first letting $\alpha$ tend to zero, then $m$ to infinity, we get the
inequality

$$\sum_{n=1}^{\infty} |\Delta_n|_G \leqslant \varepsilon.$$

Finally, it follows from (38) and Theorem 3 that

$$|O - F|_G \leqslant \sum_{n=1}^{\infty} |\Delta_n|_G \leqslant \varepsilon,$$

i.e. inequality (37). Now let the closed set $F$ be unbounded. Let $\gamma_n$ be
a closed circle with centre at the origin and radius $n$. We can form
the closed bounded sets $F_n = F \cdot \gamma_n$, and write

$$F = \sum_{n=1}^{\infty} F_n,$$

and the measurability of $F$ follows at once from Theorem 8. Theorem 9
is thus proved.

THEOREM 10. If $\mathcal{E}$ is a measurable set, the complementary set $C\mathcal{E}$ is
measurable.

Since $\mathcal{E}$ is measurable, there exist open sets $O_n$ such that $\mathcal{E} \subset O_n$
and $|O_n - \mathcal{E}|_G < 1/n$. We construct the closed sets $F_n = CO_n$. Since
$\mathcal{E} \subset O_n$, we have $F_n \subset C\mathcal{E}$, and in addition, by (19) of [30], we can
write the equation: $C\mathcal{E} - F_n = O_n - \mathcal{E}$. On replacing $F_n$ on the
left-hand side by the sum of all the $F_k$, we get

$$C\mathcal{E} - \sum_{k=1}^{\infty} F_k \subset C\mathcal{E} - F_n, \quad \text{i.e.} \quad C\mathcal{E} - \sum_{k=1}^{\infty} F_k \subset O_n - \mathcal{E},$$

and, since $|O_n - \mathcal{E}|_G < 1/n$, we have

$$\Bigl|C\mathcal{E} - \sum_{k=1}^{\infty} F_k\Bigr|_G < \frac{1}{n}.$$

The left-hand side of the inequality written is independent of $n$, and
by letting $n$ tend to infinity, we arrive at the equation

$$\Bigl|C\mathcal{E} - \sum_{k=1}^{\infty} F_k\Bigr|_G = 0,$$

from which it follows that the difference on the left is a set $\mathcal{E}_0$ of
measure zero. We can thus write $C\mathcal{E}$ as a sum of measurable sets:

$$C\mathcal{E} = \mathcal{E}_0 + \sum_{k=1}^{\infty} F_k,$$

whence it follows, by Theorem 8, that $C\mathcal{E}$ is measurable. According
to the definition, the measurability of a set is established with the aid
of open sets. We show in the next theorem that the measurability of
a set can be similarly established with the aid of closed sets.

THEOREM 11. The necessary and sufficient condition for $\mathcal{E}$ to be
measurable is that, given any positive $\varepsilon$, there exists a closed set $F$ such
that $F \subset \mathcal{E}$ and $|\mathcal{E} - F|_G < \varepsilon$.

The measurability of $\mathcal{E}$ is equivalent to the measurability of $C\mathcal{E}$,
and the necessary and sufficient condition for this is that, given any
positive $\varepsilon$, there exists an open set $O$ such that $C\mathcal{E} \subset O$ and
$|O - C\mathcal{E}|_G < \varepsilon$. If we put $F = CO$ and notice that, by (19) of [30],
$O - C\mathcal{E} = \mathcal{E} - CO = \mathcal{E} - F$ and $F \subset \mathcal{E}$, we obtain the assertion of the
theorem.

THEOREM 12. The product of a finite or denumerable number of measur- 
able sets is a measurable set. The difference between two measurable sets is 
a measurable set. 

If the sets $\mathcal{E}_n$ are measurable, the measurability of their product
follows directly from the formula [30]:

$$\prod_n \mathcal{E}_n = C \sum_n C\mathcal{E}_n$$

and Theorems 10 and 8. If $A$ and $B$ are measurable, the measurability
of their difference follows at once from the formula of [30]:
$A - B = A \cdot CB$ and the measurability of the product.

THEOREM 13. The measure of the sum of a finite or denumerable number 
of pairwise disjoint measurable sets is equal to the sum of the measures 
of the individual sets. 

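Let $\mathcal{E}_n$ be pairwise disjoint measurable sets. The measurability of
their sum follows from Theorem 8. We suppose first that each of the
$\mathcal{E}_n$ is bounded. By Theorem 11, given any positive $\varepsilon$, there exist closed
sets $F_n$ such that $F_n \subset \mathcal{E}_n$ and $|\mathcal{E}_n - F_n|_G < \varepsilon/2^n$, where the $F_n$
are obviously bounded and pairwise disjoint. The formula
$\mathcal{E}_n = F_n + (\mathcal{E}_n - F_n)$ implies at once that

$$|\mathcal{E}_n|_G \leqslant |F_n|_G + \frac{\varepsilon}{2^n}.$$

On the other hand, by considering the first $m$ of sets $F_n$, we find that

$$\sum_{n=1}^{m} F_n \subset \sum_{n} \mathcal{E}_n, \quad \text{and consequently} \quad \Bigl|\sum_{n} \mathcal{E}_n\Bigr|_G \geqslant \Bigl|\sum_{n=1}^{m} F_n\Bigr|_G.$$

We can apply our above lemma to a finite sum of pairwise disjoint
closed sets $F_n$, and hence obtain, on also making use of the inequality
$|F_n|_G \geqslant |\mathcal{E}_n|_G - \varepsilon/2^n$,

$$\Bigl|\sum_{n} \mathcal{E}_n\Bigr|_G \geqslant \sum_{n=1}^{m} |F_n|_G \geqslant \sum_{n=1}^{m} |\mathcal{E}_n|_G - \sum_{n=1}^{m} \frac{\varepsilon}{2^n}.$$

Let us take the most complicated case, when the number of sets
$\mathcal{E}_n$ is infinite. On indefinitely increasing the number $m$ in the last
inequality, we have

$$\Bigl|\sum_{n=1}^{\infty} \mathcal{E}_n\Bigr|_G \geqslant \sum_{n=1}^{\infty} |\mathcal{E}_n|_G - \varepsilon,$$

or, since $\varepsilon$ is arbitrary:

$$\Bigl|\sum_{n=1}^{\infty} \mathcal{E}_n\Bigr|_G \geqslant \sum_{n=1}^{\infty} |\mathcal{E}_n|_G.$$

On comparing this inequality with (32), we arrive at the equation:

$$\Bigl|\sum_{n=1}^{\infty} \mathcal{E}_n\Bigr|_G = \sum_{n=1}^{\infty} |\mathcal{E}_n|_G, \tag{39}$$

which proves our theorem. In view of the measurability of $\mathcal{E}_n$ and
their sum, we can write the last formula as

$$G(\mathcal{E}_1 + \mathcal{E}_2 + \mathcal{E}_3 + \ldots) = G(\mathcal{E}_1) + G(\mathcal{E}_2) + G(\mathcal{E}_3) + \ldots \tag{40}$$

We now take the case when the $\mathcal{E}_n$ include unbounded sets. Let $\gamma_n$
be a closed circle with centre at the origin and radius $n$. We take the
sets

$$\mathcal{E}_n^{(1)} = \mathcal{E}_n \gamma_1; \quad \mathcal{E}_n^{(2)} = \mathcal{E}_n (\gamma_2 - \gamma_1); \quad \mathcal{E}_n^{(3)} = \mathcal{E}_n (\gamma_3 - \gamma_2); \quad \ldots$$

Each of them is bounded, and they are all measurable, since the
closed set $\gamma_1$ and the difference between closed sets $\gamma_k - \gamma_{k-1}$ are
measurable sets, whilst the product of measurable sets is also measurable.
We can write each of the $\mathcal{E}_n$ as a sum of pairwise disjoint bounded
measurable sets:

$$\mathcal{E}_n = \sum_{k=1}^{\infty} \mathcal{E}_n^{(k)},$$

and, by what has been proved, we have

$$|\mathcal{E}_n|_G = \sum_{k=1}^{\infty} |\mathcal{E}_n^{(k)}|_G. \tag{41}$$

The sum $\mathcal{E}$ of sets $\mathcal{E}_n$ can be rewritten as a double sum of pairwise
disjoint bounded sets $\mathcal{E}_n^{(k)}$, certain of which may be empty:

$$\mathcal{E} = \sum_{n=1}^{\infty} \sum_{k=1}^{\infty} \mathcal{E}_n^{(k)}.$$

By what has been proved, we have

$$|\mathcal{E}|_G = \sum_{n=1}^{\infty} \sum_{k=1}^{\infty} |\mathcal{E}_n^{(k)}|_G.$$

Since the terms are non-negative, their order is of no importance
[I; 134]. We shall sum first over $k$, then over $n$. On using (41), we thus
arrive at (39) again, and the theorem is fully proved.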

Note. If we dispense with the assumption that no pair of the $\mathcal{E}_n$
has common points, we have for the sum $\mathcal{E}$ of the $\mathcal{E}_n$, which is
measurable by Theorem 8,

$$G(\mathcal{E}) \leqslant \sum_{n} G(\mathcal{E}_n). \tag{40$_1$}$$

This follows at once from Theorem 3 and the fact that the exterior
measure is simply the measure for measurable sets. If the measure is
zero for all the $\mathcal{E}_n$, (40$_1$) gives $G(\mathcal{E}) \leqslant 0$. But the measure cannot be
negative, therefore $G(\mathcal{E}) = 0$, i.e. the sum of a finite or denumerable
number of sets of measure zero is a set of measure zero.
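
The inequality for overlapping sets in the Note above can be checked on a small discrete model. The following is only an illustrative sketch: the weights `2**-k` and the sets `A`, `B` are assumptions chosen for the example, not taken from the text.

```python
from fractions import Fraction

# Hypothetical weights: the point k carries mass 2**-k (k = 1, ..., 10).
weights = {k: Fraction(1, 2 ** k) for k in range(1, 11)}

def G(points):
    """Measure of a finite set of points: the sum of their weights."""
    return sum((weights[k] for k in points), Fraction(0))

A = {1, 2, 3}
B = {3, 4}  # overlaps A in the point 3

# The sum of the measures overshoots the measure of the sum
# by exactly the mass of the common part.
excess = G(A) + G(B) - G(A | B)
```

In this model the excess equals the measure of the product of the two sets, so the inequality becomes an equality precisely when the sets are disjoint.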

All the theorems proved above are also valid for measurable sets
of infinite measure. A proviso has to be made in this connection in
the theorems that follow.

THEOREM 14. If $A$ and $B$ are measurable, $B \subset A$, and $B$ is of finite
measure, then

$$G(A - B) = G(A) - G(B). \tag{42}$$

The difference $A - B = D$ is measurable by Theorem 12. We have
$A = B + D$, where $B$ and $D$ have no common points. By Theorem 13,
$G(A) = G(B) + G(D)$, and subtraction of the finite number $G(B)$
from both sides gives us (42).


THEOREM 15. If $\mathcal{E}_n$ $(n = 1, 2, \ldots)$ is a non-decreasing sequence of
measurable sets, the limit set $\mathcal{E}$ is measurable and

$$G(\mathcal{E}) = \lim_{n \to \infty} G(\mathcal{E}_n). \tag{43}$$

The measurability of $\mathcal{E}$ follows at once from the formula

$$\mathcal{E} = \lim_{n \to \infty} \mathcal{E}_n = \mathcal{E}_1 + (\mathcal{E}_2 - \mathcal{E}_1) + (\mathcal{E}_3 - \mathcal{E}_2) + \ldots \tag{44}$$

The terms on the right have no common points, and if all the $\mathcal{E}_n$
are of finite measure, we have

$$G(\mathcal{E}) = G(\mathcal{E}_1) + [G(\mathcal{E}_2) - G(\mathcal{E}_1)] + [G(\mathcal{E}_3) - G(\mathcal{E}_2)] + \ldots$$

The sum of the first $n$ terms on the right is equal to $G(\mathcal{E}_n)$, i.e. (43)
follows from the last formula. If one of the $\mathcal{E}_n$ is of infinite measure,
the limit set is all the more of infinite measure, and (43) is obvious.
Notice that the value $(+\infty)$ is permissible in this formula, both
for $G(\mathcal{E}_n)$ and for $G(\mathcal{E})$.
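
A numerical illustration of (43) may help. The sketch below assumes a hypothetical discrete measure in which the point $k$ carries mass $2^{-k}$, and takes the non-decreasing sets $\{1, \ldots, n\}$, whose limit set is the set of all positive integers, of total mass 1.

```python
from fractions import Fraction

def G(points):
    """Hypothetical measure: the point k (k = 1, 2, ...) carries mass 2**-k."""
    return sum(Fraction(1, 2 ** k) for k in points)

# E_n = {1, ..., n} is a non-decreasing sequence of sets.
values = [G(range(1, n + 1)) for n in range(1, 21)]

# Distance of G(E_n) from the measure 1 of the limit set.
gaps = [1 - v for v in values]
```

Here `gaps[n - 1]` equals $2^{-n}$, so the measures $G(\mathcal{E}_n)$ increase to the measure of the limit set, as (43) asserts.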

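THEOREM 16. If $\mathcal{E}_n$ $(n = 1, 2, \ldots)$ is a non-increasing sequence of
sets of finite measure, the limit set $\mathcal{E}$ is measurable, and (43) holds.

We write $\mathcal{E}_1$ as a sum of pairwise disjoint sets:

$$\mathcal{E}_1 = \mathcal{E} + (\mathcal{E}_1 - \mathcal{E}_2) + (\mathcal{E}_2 - \mathcal{E}_3) + (\mathcal{E}_3 - \mathcal{E}_4) + \ldots \tag{45}$$

The measurability of $\mathcal{E}$ follows from Theorems 8 and 14. On applying
Theorems 13 and 14 to (45), we get

$$G(\mathcal{E}_1) = G(\mathcal{E}) + [G(\mathcal{E}_1) - G(\mathcal{E}_2)] + [G(\mathcal{E}_2) - G(\mathcal{E}_3)] + [G(\mathcal{E}_3) - G(\mathcal{E}_4)] + \ldots,$$

i.e.

$$G(\mathcal{E}_1) = G(\mathcal{E}) + G(\mathcal{E}_1) - \lim_{n \to \infty} G(\mathcal{E}_n),$$

whence (43) follows.

Note. The measurability of the limit set $\mathcal{E}$ follows from (45)
without the assumption that the $\mathcal{E}_n$ are of finite measure.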


36. Measurable sets (continued). The above theorems on measurable
sets have a number of useful corollaries. An elementary figure $R$
is the sum of a finite number of intervals, i.e. is a measurable set, and
its measure (which is the same as its exterior measure) is given by (25),
where $\Delta_k$ $(k = 1, 2, \ldots, m)$ is some division of $R$ into non-overlapping
intervals. Let $L_G$ denote the family of measurable sets, where
the subscript $G$ indicates the function $G(\Delta)$ used as a basis for forming
the family. We have extended $G(\Delta)$ to all the sets $\mathcal{E}$ belonging to $L_G$,
the function $G(\mathcal{E})$ obtained being non-negative and, by Theorem 13,
additive for a denumerable as well as a finite number of disjoint sets
$\mathcal{E}$. Let $\mathcal{E}_n$ be a vanishing sequence of sets, belonging to $L_G$ and
having finite measure, i.e. $\mathcal{E}_1 \supset \mathcal{E}_2 \supset \mathcal{E}_3 \supset \ldots$, and the limit set
$\mathcal{E}$ of the $\mathcal{E}_n$ is the empty set. It follows at once from Theorem
16 that $G(\mathcal{E}_n) \to 0$, i.e. the function $G(\mathcal{E})$ is not only non-negative
and additive, but is normal for the family of sets $L_G$. In order
to underline its additiveness for a denumerable as well as finite number
of sets appearing in $L_G$, we shall call this function completely additive.
The family $L_G$ also contains unbounded sets. Certain of these may be
of finite measure, whilst the measure of the others may be $(+\infty)$.
But evidently, not every unbounded set needs to be measurable.
Often, when forming a family of measurable sets, only bounded sets
are considered, or even sets belonging to a definite finite interval. We
shall not subject ourselves to this limitation in our future treatment.
A further point is that the initial function $G(\Delta)$ is assumed to be
defined for all finite intervals. If $G(\Delta)$ is defined only for intervals $\Delta$
belonging to some interval $\Delta_0$, it can naturally be extended to all
intervals $\Delta$ by using the formula $G(\Delta) = G(\Delta \cdot \Delta_0)$, remembering that
a product of intervals is also an interval.

The family of sets $L_G$ depends on the choice of initial function $G(\Delta)$.
But whatever the choice of this function, it always contains all
intervals, elementary figures, open sets and closed sets. We shall give later
a fuller characterization of the sets which belong to $L_G$ for any choice
of $G(\Delta)$. We shall interpret the set function as a mass. Specifying
the original function $G(\Delta)$ amounts to specifying the mass on an
interval $\Delta$, the usual conditions for non-negativeness, additiveness
and normality being obviously fulfilled. A point set $\mathcal{E}$ is measurable
if it is meaningful to speak of the mass located on $\mathcal{E}$, and $G(\mathcal{E})$ is this
mass.

We can give a simple example of when the family $L_G$ contains all point
sets of the plane. Let mass 1 be concentrated at the point $P$. Here,
$G(\Delta) = 1$ if the interval $\Delta$ contains $P$, and $G(\Delta) = 0$ if $\Delta$ does not
contain $P$. It is easily shown that the family $L_G$ for such a function
$G(\Delta)$ contains all sets, where $G(\mathcal{E}) = 1$ if $\mathcal{E}$ contains $P$, and $G(\mathcal{E}) = 0$
if $\mathcal{E}$ does not contain $P$.
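
The point-mass example can be sketched in a few lines. This is an illustration only: sets are represented by membership tests, and the location of `P` and the particular sets (a disc and a half-plane) are assumptions made for the example.

```python
# Hypothetical sketch: unit mass concentrated at a single point P of the plane.
P = (0.0, 0.0)  # assumed location of the mass

def point_mass(contains):
    """G(E) for a set E described by its membership test contains(point)."""
    return 1 if contains(P) else 0

def in_disc(pt):
    # closed unit disc; it contains P
    return pt[0] ** 2 + pt[1] ** 2 <= 1

def in_right_half(pt):
    # open half-plane x > 0; it does not contain P
    return pt[0] > 0

def in_sum(pt):
    # sum (union) of the two sets
    return in_disc(pt) or in_right_half(pt)

def in_disc_minus_half(pt):
    # difference: disc minus half-plane
    return in_disc(pt) and not in_right_half(pt)

# The sum splits into two disjoint pieces; the masses add up.
lhs = point_mass(in_sum)
rhs = point_mass(in_disc_minus_half) + point_mass(in_right_half)
```

Whatever disjoint pieces are chosen, exactly one of them can contain $P$, which is why additivity holds for every family of sets in this example.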

Let us take the important particular case when $G(\Delta)$ is equal to the
area of the interval $\Delta$. The family $L_G$ will simply be written as $L$ for
this case. Here we have an extension of the concept of area for the
wide family of sets $L$. It was this particular case that was considered
first by the French mathematician Lebesgue. The function $G(\mathcal{E})$ will
be written as $m(\mathcal{E})$ for this case. The family of sets $L$ is usually known
as the family of sets which are Lebesgue measurable. It is meaningful
to speak of an area for such sets. If $\mathcal{E}$ is a finite or denumerable set of
points, $m(\mathcal{E}) = 0$. Similarly, if $\mathcal{E}$ is a segment or the whole of a straight
line, $m(\mathcal{E}) = 0$. If we take the same interval as semi-open, open or
closed, $m(\Delta)$ has in every case the same value. If a measurable set $\mathcal{E}$
has interior points, obviously $m(\mathcal{E}) > 0$. It can be shown that there
exist bounded open sets such that $m(l) > 0$, where $l$ is the boundary
of the set ($l$ is closed and therefore measurable). For an open set, $m(O)$
is the sum of the areas of the intervals which appear in (21), this sum
being independent of the method of representing $O$ as a sum of
intervals. If $F$ is a bounded closed set, on covering it with an open interval
$\Delta_0$, we can define $m(F)$ as the difference between the measures of two
open sets, viz. $m(F) = m(\Delta_0) - m(\Delta_0 - F)$.

The whole of the construction of family $L_G$ can be performed
precisely as above in any finite-dimensional space. In particular, the family
$L$ in three-dimensional space is the family of sets having a definite
"volume", whereas in one dimension it is the family of sets having a
definite "length". Instead of this, for spaces with a finite number of
dimensions, we often speak simply of the measure of the set, if it
belongs to $L$.


37. Criteria for measurability. Various definitions, equivalent to the 
one above, can be given of measurable sets. We shall indicate some of 
these definitions, confining ourselves for the moment to bounded sets. 

THEOREM 1. The necessary and sufficient condition for a bounded set
$\mathcal{E}$ to belong to the family $L_G$ is that, given any positive $\varepsilon$, there exists an
elementary figure $R$ such that

$$\mathcal{E} + e_1 = R + e_2, \tag{46}$$

where we have the inequalities for sets $e_1$ and $e_2$:

$$|e_1|_G < \varepsilon \quad \text{and} \quad |e_2|_G < \varepsilon. \tag{47}$$

Necessity. Let $\mathcal{E}$ belong to $L_G$. There now exists an open set $O$
such that $\mathcal{E} \subset O$ and $|O - \mathcal{E}|_G < \varepsilon$. On writing $O - \mathcal{E} = e_1$, we
have $O = \mathcal{E} + e_1$, and inequality (47) holds for $e_1$. On the other hand,
by Theorem 6 of [32], $O$ is the limit of an increasing sequence of
elementary figures $R_n$, where $R_n$ is the sum of the first $n$ terms on the
right-hand side of (21). By Theorem 15, we have $G(O) = \lim_{n \to \infty} G(R_n)$,
so that we can take so large an $n = m$ that, on setting $R = R_m$, we
have $O = R + e_2$, where $|e_2|_G < \varepsilon$. On comparing both the
expressions obtained for $O$, we arrive at (46), where inequalities (47)
are fulfilled for $e_1$ and $e_2$.

Sufficiency. Given any $\varepsilon$, let (46) and inequalities (47) hold. Since $R$ is
measurable, there exists an open set $O_1$ such that $R \subset O_1$ and
$|O_1 - R|_G < \varepsilon$. On the other hand, by Theorem 4 [34], there exists an
open set $O_2$ such that $e_2 \subset O_2$ and $|O_2|_G < |e_2|_G + \varepsilon$, or, by (47),
we have $|O_2|_G < 2\varepsilon$. The open set $O = O_1 + O_2$ covers $\mathcal{E} + e_1$, and we have

$$O - \mathcal{E} \subset [O - (\mathcal{E} + e_1)] + e_1$$

or, by (6) of [30]:

$$O - \mathcal{E} \subset [(O_1 + O_2) - (R + e_2)] + e_1 \subset (O_1 - R) + (O_2 - e_2) + e_1.$$

On observing that $|O_1 - R|_G < \varepsilon$, $|O_2 - e_2|_G \leqslant |O_2|_G < 2\varepsilon$ and
(47), we obtain from this: $|O - \mathcal{E}|_G < 4\varepsilon$, so that $\mathcal{E}$ is measurable,
in view of the arbitrariness of $\varepsilon$.

THEOREM 2. The necessary and sufficient condition for a bounded set
$\mathcal{E}$ to belong to $L_G$ is that, given any positive $\varepsilon$, there exists an elementary
figure $R$ such that

$$|\mathcal{E} - R|_G < \varepsilon \quad \text{and} \quad |R - \mathcal{E}|_G < \varepsilon. \tag{48}$$

Let us prove the necessity. Let $\mathcal{E}$ belong to $L_G$. There exists an
elementary figure $R$ such that we have (46) and (47). Inequalities (48)
follow from the obvious relationships $\mathcal{E} - R \subset e_2$ and $R - \mathcal{E} \subset e_1$.
Let us prove the sufficiency. Given any $\varepsilon$, let there exist an $R$ for
which inequalities (48) are satisfied. If we put $\mathcal{E} - R = e_2$ and
$R - \mathcal{E} = e_1$, we get $\mathcal{E} + e_1 = R + e_2$, where $e_1$ and $e_2$ satisfy
inequalities (47), and the measurability of $\mathcal{E}$ follows at once from
Theorem 1.

THEOREM 3. The necessary and sufficient condition for a set $\mathcal{E}$ (which
may be unbounded) to belong to $L_G$ is that, given any positive $\varepsilon$, there
exist an open set $O$ and a closed set $F$ such that

$$F \subset \mathcal{E} \subset O \quad \text{and} \quad |O - F|_G < \varepsilon. \tag{49}$$

If $\mathcal{E}$ is measurable, by the definition and Theorem 11 [35], there exist
an $F$ and $O$ such that $F \subset \mathcal{E} \subset O$, $|\mathcal{E} - F|_G < \varepsilon/2$ and
$|O - \mathcal{E}|_G < \varepsilon/2$. We have further: $O - F = (O - \mathcal{E}) + (\mathcal{E} - F)$, whence (49)
follows. Conversely, let (49) hold. Now, all the more $|O - \mathcal{E}|_G < \varepsilon$
and $\mathcal{E}$ is therefore measurable by the definition.

A further criterion for measurability will be given without proof.
The necessary and sufficient condition for $\mathcal{E}$ to be measurable is that,
for any choice of set $A$, we have

$$|A|_G = |A \cdot \mathcal{E}|_G + |A - \mathcal{E}|_G. \tag{50}$$


38. Field of sets. We bring in a new concept regarding families of
point sets, a family of sets being understood to mean a system of sets
(a set of sets). We define a field of sets as a family of sets with the
following properties:

(1) if the sets $\mathcal{E}_1$ and $\mathcal{E}_2$ appear in the family, their difference $\mathcal{E}_1 - \mathcal{E}_2$
appears in the family;

(2) if $\mathcal{E}_1$ and $\mathcal{E}_2$ have no common points, and appear in the family,
their sum $\mathcal{E}_1 + \mathcal{E}_2$ belongs to the family.

The following are immediate consequences of this definition. The
empty set, which is the difference between two identical sets belonging
to the field of sets, must belong to any field of sets. Further, it follows
at once from the formulae [30]:

$$\mathcal{E}_1 \mathcal{E}_2 = \mathcal{E}_1 - (\mathcal{E}_1 - \mathcal{E}_2); \quad \mathcal{E}_1 + \mathcal{E}_2 = \mathcal{E}_1 + (\mathcal{E}_2 - \mathcal{E}_1)$$

that the product of two sets belonging to the field also belongs to the
field, and the sum of two sets belonging to the field belongs to the field,
even when the sets have common points. This statement can evidently
be generalized to any finite number of factors or terms, i.e. the sum
and product of a finite number of sets belonging to the field belong
to the field.
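
The two formulae from [30] used above can be checked directly on small finite sets. The sets `E1` and `E2` below are arbitrary illustrative choices; Python's set difference `-` plays the role of the difference of sets, and `|` is used only on pieces that are verified to be disjoint.

```python
E1 = {1, 2, 3, 4}
E2 = {3, 4, 5}

# Product expressed through differences: E1 E2 = E1 - (E1 - E2).
product = E1 - (E1 - E2)

# Sum expressed as a sum of two disjoint sets: E1 + E2 = E1 + (E2 - E1).
tail = E2 - E1
total = E1 | tail
```

Only the operations allowed by the definition of a field (difference, and sum of disjoint sets) are needed to build the product and the general sum.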

We shall strengthen the requirement of the second part of the
definition of a field of sets by requiring that the sum of a denumerable
number of disjoint sets belonging to the field belongs to the field.
Such a field of sets is described as closed. Thus, a closed field of sets
is a family of sets with the following two properties:

(1) if sets $\mathcal{E}_1$ and $\mathcal{E}_2$ appear in the family, their difference $\mathcal{E}_1 - \mathcal{E}_2$
appears in the family;

(2) if the family contains a finite or denumerable number of disjoint
sets $\mathcal{E}_n$, it also contains their sum.

Precisely as above, it can be seen
that a closed field of sets contains any finite sums and products of
sets appearing in it. Let us show that a closed field of sets contains
the sums and products of a denumerable number of sets appearing
in it. To see this, we write down the following two formulae:

$$\sum_{n=1}^{\infty} \mathcal{E}_n = \mathcal{E}_1 + (\mathcal{E}_2 - \mathcal{E}_1) + [\mathcal{E}_3 - (\mathcal{E}_1 + \mathcal{E}_2)] + [\mathcal{E}_4 - (\mathcal{E}_1 + \mathcal{E}_2 + \mathcal{E}_3)] + \ldots \tag{51}$$

$$\prod_{n=1}^{\infty} \mathcal{E}_n = \mathcal{E}_1 - [(\mathcal{E}_1 - \mathcal{E}_2) + (\mathcal{E}_1 - \mathcal{E}_3) + (\mathcal{E}_1 - \mathcal{E}_4) + \ldots]. \tag{52}$$

The proof of these formulae presents no difficulty. It is sufficient to
verify that every element (point) appearing in a set on the left-hand
side appears in a set on the right-hand side, and vice versa. Let $\mathcal{E}_n$
appear in the closed field of sets $T$. The terms on the right-hand side
of (51) are now disjoint, and appear in $T$. Consequently, by the
definition of the closed field $T$, the sum of sets $\mathcal{E}_n$ also appears in $T$.
The terms in the square brackets on the right-hand side of (52)
appear in $T$, so that their sum also appears in $T$. Thus the whole of
the right-hand side, i.e. the product of sets $\mathcal{E}_n$, appears in $T$, which is
what we had to prove.
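
Formulae (51) and (52) can be tried out on a finite family of finite sets, which is of course only a truncated illustration of the denumerable case. The helper names `disjointify` and `product_via_differences` and the sample family are hypothetical.

```python
def disjointify(sets):
    """Terms of (51): E_1, E_2 - E_1, E_3 - (E_1 + E_2), ..."""
    seen, terms = set(), []
    for s in sets:
        terms.append(s - seen)  # remove everything already covered
        seen = seen | s
    return terms

def product_via_differences(sets):
    """Right-hand side of (52): E_1 - [(E_1 - E_2) + (E_1 - E_3) + ...]."""
    first = sets[0]
    removed = set()
    for s in sets[1:]:
        removed |= first - s
    return first - removed

family = [{1, 2, 3, 6}, {2, 3, 4, 6}, {3, 5, 6}, {1, 3, 6, 7}]
terms = disjointify(family)
common = product_via_differences(family)
```

The terms returned by `disjointify` are pairwise disjoint and have the same sum as the original family, while `common` coincides with the product of all the sets.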

It follows at once from the theorems proved in [36] that $L_G$ is a
closed field of sets. We consider the interval function $G(\Delta)$, which
we have extended to the closed field $L_G$. The family of intervals
does not represent a field, since the difference between two intervals
may not be an interval. The family of elementary figures $R$ is a field,
though not a closed field. Our process for extending $G(\Delta)$ consisted
in first extending $G(\Delta)$ to the field of elementary figures, then to the
closed field $L_G$. The function $G(\mathcal{E})$ proved here to be non-negative,
completely additive and normal in $L_G$ in the sense indicated in [36].
Let us explain the connection between the concepts of normality and
additiveness for a set function.

Let $T$ be a field, which may be non-closed. The function $G(\mathcal{E})$,
defined for all sets appearing in $T$, is said to be completely additive
in $T$ when the following condition is fulfilled: if the set $\mathcal{E}$ belonging
to $T$ is the sum of a finite or denumerable number of sets $\mathcal{E}_n$ that also
belong to $T$ and are mutually disjoint, then

$$G(\mathcal{E}_1 + \mathcal{E}_2 + \ldots) = G(\mathcal{E}_1) + G(\mathcal{E}_2) + \ldots$$

The concept of completely additive function has already been
mentioned. There is a direct connection between the concepts of
completely additive and normal functions, which is expressed by the
following theorem:

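THEOREM. The necessary and sufficient condition for a function $G(\mathcal{E})$,
defined on a field $T$ and taking only finite values, to be additive and
normal is that it be completely additive.

If the function is additive, it follows at once that, given $A \subset B$,
$G(B - A) = G(B) - G(A)$. Let $G(\mathcal{E})$ be additive and normal, and
let us show that it is completely additive. Suppose that $\mathcal{E}$ is the sum
of a denumerable number of pairwise disjoint sets $\mathcal{E}_n$ $(n = 1, 2, \ldots)$.
We can write

$$\mathcal{E} = \mathcal{E}_1 + \ldots + \mathcal{E}_n + [\mathcal{E} - (\mathcal{E}_1 + \mathcal{E}_2 + \ldots + \mathcal{E}_n)]$$

and, since the function is additive,

$$G(\mathcal{E}) = G(\mathcal{E}_1) + \ldots + G(\mathcal{E}_n) + G[\mathcal{E} - (\mathcal{E}_1 + \ldots + \mathcal{E}_n)]. \tag{53}$$

But $\mathcal{E} - (\mathcal{E}_1 + \ldots + \mathcal{E}_n)$ is a vanishing sequence. We pass to the
limit in equation (53), taking account of the normality:

$$G(\mathcal{E}) = \lim_{n \to \infty} [G(\mathcal{E}_1) + \ldots + G(\mathcal{E}_n)] = \sum_{n=1}^{\infty} G(\mathcal{E}_n).$$

This shows that $G(\mathcal{E})$ is completely additive. Now suppose, conversely,
that $G(\mathcal{E})$ is completely additive; we show that it is normal.
Let $\mathcal{E}_1 \supset \mathcal{E}_2 \supset \mathcal{E}_3 \supset \ldots$ be a vanishing sequence. We have to show that
$G(\mathcal{E}_n) \to 0$. We can write

$$\mathcal{E}_1 = (\mathcal{E}_1 - \mathcal{E}_2) + (\mathcal{E}_2 - \mathcal{E}_3) + \ldots + (\mathcal{E}_{n-1} - \mathcal{E}_n) + \mathcal{E}_n,$$

where the terms in brackets are disjoint; hence we have, since $G(\mathcal{E})$ is
additive:

$$G(\mathcal{E}_n) = G(\mathcal{E}_1) - [G(\mathcal{E}_1 - \mathcal{E}_2) + G(\mathcal{E}_2 - \mathcal{E}_3) + \ldots + G(\mathcal{E}_{n-1} - \mathcal{E}_n)]. \tag{54}$$

On the other hand, it follows from the formula

$$\mathcal{E}_1 = \sum_{k=1}^{\infty} (\mathcal{E}_k - \mathcal{E}_{k+1}),$$

since $G(\mathcal{E})$ is completely additive, that

$$G(\mathcal{E}_1) = \sum_{k=1}^{\infty} G(\mathcal{E}_k - \mathcal{E}_{k+1}) = \lim_{n \to \infty} \sum_{k=1}^{n-1} G(\mathcal{E}_k - \mathcal{E}_{k+1}),$$

and a comparison with (54) shows that $G(\mathcal{E}_n) \to 0$, which is what we
wanted to prove. Above, we extended a non-negative, additive and
normal interval function $G(\Delta)$ to a closed field $L_G$, the function
$G(\mathcal{E})$ thus obtained being completely additive. It can be shown that
no other extension of $G(\Delta)$ to $L_G$ is possible, given complete
additivity.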


39. Independence of the choice of axes. Some remarks must be
made regarding the independence of the measure of the choice of axes.
The original function $G(\Delta)$ was defined on semi-open rectangles,
whose sides are parallel to the $X$ and $Y$ axes.

The field $L_G$ contains semi-open rectangles on the plane with any
direction of side, since every such rectangle is the difference between
a closed rectangle and the closed set of points composing two sides
and three vertices of the rectangle. The function $G(\mathcal{E})$ is therefore
defined, in particular, on all semi-open rectangles $\Delta'$, whose sides are
parallel to some other Cartesian system of axes $X'$ and $Y'$, $G(\Delta')$
being additive and normal on these rectangles. If, after choosing the
new axes $X'$ and $Y'$, we start from $G(\Delta')$ and extend it as indicated
above, we arrive at a field $L_{G'}$. It may easily be shown that this field
is the same as $L_G$, and that, with the new extension that we perform
by starting from $G(\Delta')$, the same values of $G(\mathcal{E})$ are obtained on all
intervals as were obtained with the previous extension, which was
performed by starting from $G(\Delta)$. This assertion is based on the fact
that every open set $O$ can be written either as the sum of intervals $\Delta_k$,
no pair of which has common points, or as the sum of $\Delta'_k$, no pair of
which has common points, where

$$G(O) = \sum_{k=1}^{\infty} G(\Delta_k) = \sum_{k=1}^{\infty} G(\Delta'_k).$$

It follows from this, by Theorem 4 of [34], that the exterior measures
of any set in the two coordinate systems are the same. It further
follows from the definition of measurability that a set will be
simultaneously measurable in both coordinate systems, i.e. the field $L_{G'}$ is
the same as the field $L_G$. The coincidence of the measures in both
coordinate systems follows at once from the fact that, by the above-mentioned
theorem, $G(\mathcal{E})$ is the strict lower bound of the measures of the open
sets covering $\mathcal{E}$. A further remark: if $G(\Delta)$ is the ordinary area of the
rectangle $\Delta$, i.e. the Lebesgue measure $m(\Delta)$, then $G(\Delta')$ is the ordinary
area of $\Delta'$ [cf. II, 92].


40. The B field. As we have indicated, a closed field $L_G$ depends
on the choice of function $G(\Delta)$. We shall next indicate a closed field
such that a set appearing in it belongs to any closed field $L_G$ and, in
particular, belongs to $L$. We take all possible closed fields $T$ such that
every $T$ contains all closed intervals, and we form the family of sets $B$,
consisting of sets belonging to all the above-mentioned closed fields $T$.
It may easily be seen that the family of sets $B$ is also a closed field. For,
if $\mathcal{E}_1$ and $\mathcal{E}_2$ belong to $B$, they belong to all our closed fields $T$, i.e.
the difference $\mathcal{E}_1 - \mathcal{E}_2$ belongs to all the $T$, and hence to the family
$B$. The second part of the definition of closed field may be proved
similarly. The closed field $B$ is therefore the common part of all the
closed fields that contain all possible closed intervals. This closed
field $B$ evidently appears in the composition of each $L_G$, since this
latter also contains all closed intervals. Every open set has been seen
[33] to be expressible as the sum of a denumerable number of closed
intervals, so that the closed field $B$ contains all open sets. Any closed
set $F$ is the complement of some open set $O$, i.e. can be expressed as
the difference between the entire plane (an open set) and the set $O$,
i.e. the $B$ field also contains all closed sets. The field of sets $B$ was
first considered by the French mathematician Borel (before Lebesgue).
The sets belonging to the $B$ field are sometimes called $B$ measurable
sets or Borel (measurable) sets.

A different definition to the above can be given of the $B$ field, viz.
we reckon that a set $\mathcal{E}$ belongs to the $B$ field if it can be obtained from
closed intervals with the aid of the following two operations, applied
a finite or denumerable number of times:

(1) formation of the sum of a finite or denumerable number of sets
already constructed;

(2) formation of the product of a finite or denumerable number of
sets already constructed.

This definition requires certain explanations,
but we shall not dwell on these. We shall also omit the proof that the
new definition of $B$ field is equivalent to the previous one. We shall
conclude the present section by proving two simple theorems.

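THEOREM 1. If $\mathcal{E}$ is any set of $L_G$, there exist two sets $\mathcal{E}_1$ and $\mathcal{E}_2$
belonging to the $B$ field (and hence to the field $L_G$), such that

$$\mathcal{E}_1 \subset \mathcal{E} \subset \mathcal{E}_2 \quad \text{and} \quad G(\mathcal{E}_1) = G(\mathcal{E}_2) = G(\mathcal{E}). \tag{55}$$

We know that there exist, for a set $\mathcal{E}$ belonging to $L_G$, closed sets $F_n$
and open sets $O_n$ such that

$$F_n \subset \mathcal{E} \subset O_n; \quad G(\mathcal{E} - F_n) < \frac{1}{n}; \quad G(O_n - \mathcal{E}) < \frac{1}{n}. \tag{56}$$

The sets $F_n$ and $O_n$ belong to the $B$ field. Consequently, by the
definition of closed field, we can assert that the sum of sets $F_n$ and
the product of sets $O_n$ belong to the $B$ field:

$$\mathcal{E}_1 = \sum_{n=1}^{\infty} F_n; \quad \mathcal{E}_2 = \prod_{n=1}^{\infty} O_n. \tag{57}$$

Recalling that $F_n \subset \mathcal{E} \subset O_n$, we can assert that $\mathcal{E}_1 \subset \mathcal{E} \subset \mathcal{E}_2$.
Moreover, $\mathcal{E} - \mathcal{E}_1 \subset \mathcal{E} - F_n$ and $\mathcal{E}_2 - \mathcal{E} \subset O_n - \mathcal{E}$, so that
$G(\mathcal{E} - \mathcal{E}_1) < 1/n$ and $G(\mathcal{E}_2 - \mathcal{E}) < 1/n$ for any $n$. The left-hand sides
are independent of $n$, so that $G(\mathcal{E} - \mathcal{E}_1) = G(\mathcal{E}_2 - \mathcal{E}) = 0$, which leads to
equation (55), and the theorem is proved. It can be stated as follows: every
set $\mathcal{E}$ of $L_G$ can be included between two sets of $B$ that have the same
measure as the given set $\mathcal{E}$.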

A family of sets, which will be useful later, can be distinguished
from the $B$ field.

DEFINITION. A set $\mathcal{E}$ is said to be a $G_\delta$ set if it is an open set or the
product of a denumerable number of open sets.

We notice first of all that, as mentioned above [32], the product
of a denumerable number of open sets may or may not be an open set.
It follows at once from the definition of $G_\delta$ sets that the product of a
finite or denumerable number of $G_\delta$ sets is also a $G_\delta$ set. Let us show
that every finite closed interval $\Delta$ $(a \leqslant x \leqslant b,\ c \leqslant y \leqslant d)$ is a $G_\delta$
set. In fact, we can write it as the product of open sets $\Delta_n$
$(a - \varepsilon_n < x < b + \varepsilon_n,\ c - \varepsilon_n < y < d + \varepsilon_n)$, where $\varepsilon_n$ is a sequence of
positive numbers tending to zero. It may readily be shown that every
closed set is a $G_\delta$ set. We shall not require this in future. The following
assertion is an immediate consequence of the theorem proved above:

THEOREM 2. Every measurable set $\mathcal{E}$ can be covered by a set $H$ of type
$G_\delta$ such that $G(\mathcal{E}) = G(H)$.

Notice also that, if $\mathcal{E}$ belongs to the closed interval $\Delta$, the covering set $H$
can be chosen in such a way that it belongs to $\Delta$. For, if $H$ is a $G_\delta$ set
covering $\mathcal{E}$ and satisfying the condition $G(\mathcal{E}) = G(H)$, the set
$H' = H\Delta$ will also be a $G_\delta$ set, covering $\mathcal{E}$ and satisfying the condition
$G(\mathcal{E}) = G(H')$, where $H'$ obviously belongs to $\Delta$.


41. The case of a single variable. The theory of measure takes a
simpler form in the case of one variable. As we know, a non-negative,
additive and normal function of a semi-open interval reduces to a
non-decreasing point function $g(x)$:

$$G(\Delta) = G((a, b]) = g(b + 0) - g(a + 0).$$

Starting from this function $G(\Delta)$, as indicated above, we form a
set function $G(\mathcal{E})$, defined for all sets belonging to $L_G$. The value
of $G(\mathcal{E})$ for a set $\mathcal{E}$ belonging to $L_G$ is sometimes called the variation
of $g(x)$ on the set $\mathcal{E}$. If $g(x) = x$, we get Lebesgue measurable sets,
and $G(\mathcal{E})$ is a generalization of the concept of length for such a set.
If $g(x)$ is defined only in some interval, it can be extended to the entire
axis, as indicated above.
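
As a small illustration of the interval function $G((a, b]) = g(b + 0) - g(a + 0)$, the sketch below assumes a hypothetical non-decreasing $g$ equal to $x$ for $x < 0$ with an additional jump of 1 at $x = 0$. The right-hand limit is coded directly as `g_right`, since only $g(x + 0)$ enters the formula.

```python
def g_right(x):
    """Right-hand limit g(x + 0) of the hypothetical g: slope 1 plus a unit jump at 0."""
    return x + (1 if x >= 0 else 0)

def G(a, b):
    """Interval function G((a, b]) = g(b + 0) - g(a + 0) for the semi-open interval (a, b]."""
    return g_right(b) - g_right(a)

whole = G(-1, 1)               # (-1, 1]
left, right = G(-1, 0), G(0, 1)  # (-1, 0] and (0, 1], which are disjoint
```

The jump at $x = 0$ is counted in $(-1, 0]$, which contains the point 0, and not in $(0, 1]$; the values on the two disjoint pieces add up to the value on the whole interval.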

We introduce instead of $x$ the new variable $t$ in accordance with

$$t = g(x), \tag{58}$$

the meaning of this change of variable being as follows. If $g(x)$ is
continuous at a point $x$, the corresponding value $t$ is defined by (58).
Whereas if $x$ is a point of discontinuity, the closed interval
$[g(x - 0),\ g(x + 0)]$ of the variable $t$ is taken as corresponding to it. The
semi-open interval $(a, b]$ of the variable $x$ becomes, with this
correspondence, the semi-open interval $(g(a + 0),\ g(b + 0)]$ of the variable $t$;
the latter interval degenerates to a point when $g(b + 0) = g(a + 0)$.
If $e_x$ are sets of the $x$ axis and $e_t$ are the corresponding sets of the $t$ axis,
it can be shown that the exterior measure of $e_x$ with respect to $g(x)$ is
equal to the exterior measure of $e_t$ in the Lebesgue sense, i.e. evaluated
on condition that the length of a semi-open interval is taken as basis.
In the case of a single variable, an elementary figure is the sum of a
finite number of semi-open intervals having no common points, and it
can readily be shown that, if $e_x$ is measurable with respect to $g(x)$, $e_t$ is
Lebesgue measurable, the measure of $e_x$ with respect to $g(x)$ being
equal to the Lebesgue measure of the set $e_t$.


§ 2. Measurable functions 


42. Definition of measurable function. The problem of this and the
following sections is the construction of a certain class of functions and
the investigation of the properties of the functions. A more general
definition of integral will be given later on the basis of this class of
functions. We shall assume in our treatment that the function $G(\Delta)$, on
which the theory of measure is based, is fixed in some manner, i.e. we
shall be concerned with a definite field $L_G$. This may be e.g. the field
of Lebesgue measurable sets $L$. Let a point function $f(P)$, that
takes real values, be given on the measurable set $\mathcal{E}$. These values may
be finite or infinite, i.e. $f(P)$ can take the values $(+\infty)$ and $(-\infty)$
as well as finite values. We introduce the following notation. We
write $\mathcal{E}[f > a]$ for the set of points of $\mathcal{E}$ at which $f(P) > a$. Similarly,
$\mathcal{E}[f < a]$ means the set of points of $\mathcal{E}$ at which $f(P) < a$. If $f(P)$ and
$g(P)$ are two functions, the symbol $\mathcal{E}[f = g]$ denotes the set of points
of $\mathcal{E}$ at which $f(P) = g(P)$, and so on.

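DEFINITION. A function $f(P)$, given on a measurable set $\mathcal{E}$, is said to be
measurable if, given any real $a$, the sets

$$\mathcal{E}[f \geqslant a]; \quad \mathcal{E}[f < a]; \quad \mathcal{E}[f > a]; \quad \mathcal{E}[f \leqslant a] \tag{1}$$

are measurable. We first prove the following theorem: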




THEOREM 1. A sufficient condition for the measurability of sets (1)
for any $a$ is that one of the sets be measurable for any $a$.

The sets $\mathcal{E}[f \geqslant a]$ and $\mathcal{E}[f < a]$ are complementary, and the
measurability of one of them for any $a$ is equivalent to the
measurability of the other. Similarly, the measurability of the third of sets (1)
is equivalent to the measurability of the fourth. Let us show, say, that
the measurability of the third set for any $a$ implies the measurability
of the remaining sets. In fact, the measurability of the third set
implies the measurability of the fourth, and of the set $\mathcal{E}[f \geqslant a]$, which
can be written as

$$\mathcal{E}[f \geqslant a] = \prod_{n=1}^{\infty} \mathcal{E}\Bigl[f > a - \frac{1}{n}\Bigr],$$

so that the second set is also measurable.
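We notice also that the sets $\mathcal{E}[f = +\infty]$ and $\mathcal{E}[f = -\infty]$ can be
written as

$$\mathcal{E}[f = +\infty] = \prod_{n=1}^{\infty} \mathcal{E}[f > n]; \quad \mathcal{E}[f = -\infty] = \prod_{n=1}^{\infty} \mathcal{E}[f < -n].$$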


Actually, it is sufficient to prove the measurability of (1) only for
rational values of $a$. For, every irrational $a$ can be written as the
limit of a decreasing sequence of rational $a_n$, and the measurability of
$\mathcal{E}[f > a]$ follows directly from the formula

$$\mathcal{E}[f > a] = \sum_{n=1}^{\infty} \mathcal{E}[f > a_n].$$


We shall give a number of simple properties of measurable functions, 
following directly from the above definition. 

THEOREM 2. If $f(P)$ is measurable on $\mathcal{E}$, it is measurable on any
measurable part $\mathcal{E}'$ of the set $\mathcal{E}$. If $f(P)$ is measurable on a finite or
denumerable number of pairwise disjoint sets $\mathcal{E}_n$, it is measurable on the
set $\mathcal{E}$ representing the sum of the $\mathcal{E}_n$.

These statements follow directly from the formulae:

$$\mathcal{E}'[f > a] = \mathcal{E}[f > a] \cdot \mathcal{E}'; \quad \mathcal{E}[f > a] = \sum_{n} \mathcal{E}_n[f > a].$$

THEOREM 3. If $\mathcal{E}$ is a set of measure zero, any function $f(P)$ is
measurable on this set.

For, given any $a$, the set $\mathcal{E}[f > a]$ is part of the set $\mathcal{E}$, which has
measure zero, i.e. the set $\mathcal{E}[f > a]$ has measure zero, i.e. $f(P)$ is
measurable.

DEFINITION. Two functions $f(P)$ and $g(P)$, defined on a set $\mathcal{E}$, are
said to be equivalent on this set, or simply equivalent, if the set
$\mathcal{E}[f \neq g]$ has measure zero. We prove the following theorem on
equivalent functions.

THEOREM 4. If $f(P)$ and $g(P)$ are equivalent functions on a measurable
set $\mathcal{E}$, and one of them is measurable, then the other is measurable.

By hypothesis, the set $\mathcal{E}[f \neq g] = A$ is a set of measure zero. On the
measurable set $\mathcal{E}' = \mathcal{E} - A$, we have $f(P) = g(P)$. The measurability
of $f(P)$ on $\mathcal{E}$ implies the measurability of $f(P)$ on $\mathcal{E}'$, so that $g(P)$ is
also measurable on $\mathcal{E}'$. By Theorem 3, the function $g(P)$ is measurable
on the set $A$. Hence, by Theorem 2, $g(P)$ is measurable on the set
$\mathcal{E} = \mathcal{E}' + A$, and the theorem is proved.

It is easily shown that, if $f_1$ is equivalent to $g_1$, and $f_2$ equivalent to $g_2$,
then $f_1 + f_2$ is equivalent to $g_1 + g_2$, $f_1 f_2$ is equivalent to $g_1 g_2$, and
$f_1/f_2$ is equivalent to $g_1/g_2$, provided the relevant operations have a
meaning almost everywhere.

If two continuous functions are equivalent in the sense of the Lebesgue
measure on some interval or throughout the plane, it is easily seen
that their values are the same at every point. For if we were to have,
say, $f(P_0) - g(P_0) > 0$ at some point, this inequality would be retained,
by virtue of the continuity of the functions, in some sufficiently small
$\varepsilon$-neighbourhood $\delta$ of $P_0$, where $m(\delta) > 0$, and this contradicts the
definition of equivalent functions.

Let us quote some simple examples of measurable functions. Let
$f(P)$ be continuous in a finite closed interval $\Delta_0$. Given any $a$, we take
the set $\Delta_0[f(P) \geq a]$ and show that it is closed. It will then follow
immediately that it is measurable, so that $f(P)$ is a measurable function.
If $P_n$ $(n = 1, 2, \ldots)$ is a sequence of points having a limit point $P$,
and $f(P_n) \geq a$, then $f(P) \geq a$ by virtue of the continuity of the
function, and this shows that the set $\Delta_0[f(P) \geq a]$ is closed. Similarly,
if $f(P)$ is continuous throughout the plane, it is measurable. For, if $\Delta_0$
is any closed interval, the set $\Delta_0[f(P) \geq a]$ is measurable, as we have
just shown. The limiting set will also be measurable on extension
of $\Delta_0$. This limiting set is the set of all the points of the plane where
$f(P) \geq a$.

Now let $f(P)$ have a point of discontinuity $P_0$. We cover it by a
sequence of open sets $\Delta_n$ $(n = 1, 2, \ldots)$, which shrink indefinitely
to $P_0$. Outside $\Delta_n$ the function $f(P)$ is continuous and the set $e_n$ of the
points $P$ where $f(P) \geq a$ is closed. As $n$ increases, the sets $e_n$ do not
decrease and tend to the measurable set $e$. We must also add $P_0$ to
this set, if $f(P_0) \geq a$, and thus obtain the set of all points at which
$f(P) \geq a$, this set being measurable in view of the above. The same
arguments apply in the case of a finite number of points of discontinuity,
i.e. a function with a finite number of discontinuities is
measurable.

The following statement will be given without proof: if $f(P)$ takes
finite values on the closed interval $\Delta_0$ and the set of its points of
discontinuity has measure zero, $f(P)$ is measurable on $\Delta_0$. But this
condition for measurability is merely sufficient. An example is easily
given, in which every point of a set $\mathcal{E}$ is a point of discontinuity, yet
the function is measurable. We take the function $f(x)$, defined on the
interval $[0, 1]$ as follows: $f(x) = 0$ if $x$ is a rational number, and
$f(x) = 1$ if $x$ is irrational. We take the Lebesgue measure, i.e. the case
when $G(\Delta)$ is the length of an interval. The measure of any point is now
equal to zero. The rational points of the interval $[0, 1]$ form a denumerable
set, and in view of the fact that the measure is completely additive,
the set of rational points also has measure zero. The given $f(x)$ differs
from a function identically unity throughout the interval only on the
set of rational points having measure zero, i.e. $f(x)$ is equivalent to a
function identically equal to unity, and $f(x)$ is measurable by Theorem
4. But it is easily seen that every point $x_0$ of the interval $[0, 1]$ is a
point of discontinuity of $f(x)$. For, there are both rational and irrational
values of $x$ in any $\varepsilon$-neighbourhood of $x_0$, i.e. $f(x)$ takes both the value 0
and the value 1 in any $\varepsilon$-neighbourhood of $x = x_0$, so that $x_0$ is a point
of discontinuity. We shall indicate in [45] the deep bond between the
concepts of measurability and continuity.

We shall also consider the so-called piecewise constant function on
a measurable set $\mathcal{E}$, i.e. the function $f(P)$ which takes a finite or
denumerable number of values $c_k$ $(k = 1, 2, \ldots)$ on $\mathcal{E}$. If the sets $\mathcal{E}_k$
on which $f(P) = c_k$ are measurable, it follows at once from the
definition of measurability that the piecewise constant function $f(P)$
is measurable on $\mathcal{E}$. Let us give a further example. Let $f(P)$ be measurable
on the measurable set $\mathcal{E}$. Suppose that it is zero on the complement
$C\mathcal{E}$. The function thus formed is measurable on $\mathcal{E}$ and on $C\mathcal{E}$, i.e. by
Theorem 2, is measurable throughout the plane.

To mention a further case of a single variable, let $g(x)$ be a
non-decreasing function on which the measure is based [42], and $f(x)$ a
measurable function. It is sometimes said in this case that $f(x)$ is
measurable with respect to $g(x)$, whilst if $g(x) = x$, we simply say
that $f(x)$ is measurable.


43. Properties of measurable functions. Certain further properties
of measurable functions are worth mentioning.

THEOREM 1. If $f(P)$ is a measurable function, $|f(P)|$ is also a measurable
function.

This follows at once from the formula:

$$\mathcal{E}[|f| > a] = \mathcal{E}[f > a] + \mathcal{E}[f < -a].$$

THEOREM 2. If $f(P)$ is a measurable function and $c$ is a finite constant,
different from zero, $c + f(P)$ and $cf(P)$ are measurable functions.

The first assertion follows at once from the formula:

$$\mathcal{E}[c + f(P) > a] = \mathcal{E}[f(P) > a - c],$$

and the second from the formulae:

$$\mathcal{E}[cf(P) > a] = \mathcal{E}\left[f(P) > \frac{a}{c}\right] \quad \text{for } c > 0,$$

$$\mathcal{E}[cf(P) > a] = \mathcal{E}\left[f(P) < \frac{a}{c}\right] \quad \text{for } c < 0.$$


THEOREM 3. If $f(P)$ and $g(P)$ are measurable functions, the set $\mathcal{E}[f > g]$
is measurable.

We enumerate all the rational numbers: $r_1, r_2, \ldots$ The measurability
of the set of the theorem follows at once from the formula

$$\mathcal{E}[f > g] = \sum_{n=1}^{\infty} \mathcal{E}[f > r_n] \cdot \mathcal{E}[g < r_n].$$
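
The formula of Theorem 3 can be checked by two inclusions. The short verification below is an added sketch, not part of the original proof. It relies only on the fact that a rational number lies between any two distinct real values, and it writes unions and intersections with $\bigcup$ and $\cap$ instead of the sum and product notation of the text.

```latex
\begin{aligned}
&\text{If } f(P) > g(P), \text{ choose a rational } r_n \text{ with } f(P) > r_n > g(P); \\
&\qquad \text{then } P \in \mathcal{E}[f > r_n] \cap \mathcal{E}[g < r_n]. \\
&\text{Conversely, if } P \in \mathcal{E}[f > r_n] \cap \mathcal{E}[g < r_n] \text{ for some } n, \\
&\qquad \text{then } f(P) > r_n > g(P), \text{ so } P \in \mathcal{E}[f > g]. \\
&\text{Hence } \mathcal{E}[f > g] = \bigcup_{n=1}^{\infty}
  \bigl( \mathcal{E}[f > r_n] \cap \mathcal{E}[g < r_n] \bigr).
\end{aligned}
```

Each set on the right is measurable, and a denumerable sum of measurable sets is measurable, which is all the theorem requires.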


THEOREM 4. If $f(P)$ and $g(P)$ are measurable functions taking finite
values, the functions $f - g$, $f + g$, $fg$ and $f/g$ (when $g \neq 0$) are
measurable.

The measurability of $f - g$ follows at once from

$$\mathcal{E}[f - g > a] = \mathcal{E}[f > a + g]$$

and Theorems 2 and 3. The measurability of the sum follows from
$f + g = f - (-g)$ and Theorem 2 with $c = -1$. The measurability of
the square $f^2$ of the measurable function $f$ follows at once from

$$\mathcal{E}[f^2 > a] = \mathcal{E}[f > \sqrt{a}] + \mathcal{E}[f < -\sqrt{a}]$$

for $a \geq 0$ (for $a < 0$ the set $\mathcal{E}[f^2 > a]$ is the whole of $\mathcal{E}$),
whilst the measurability of the product $fg$ follows from

$$fg = \frac{1}{4}\left[(f + g)^2 - (f - g)^2\right].$$

We use the following formulae to prove the measurability of $1/g$, on
condition that $g$ does not take the value 0:

$$\mathcal{E}\left[\frac{1}{g} > a\right] = \mathcal{E}[g > 0] \cdot \mathcal{E}\left[g < \frac{1}{a}\right] \quad \text{for } a > 0;$$

$$\mathcal{E}\left[\frac{1}{g} > a\right] = \mathcal{E}[g > 0] + \mathcal{E}\left[g < \frac{1}{a}\right] \quad \text{for } a < 0;$$

$$\mathcal{E}\left[\frac{1}{g} > a\right] = \mathcal{E}[g > 0] \quad \text{for } a = 0.$$

Finally, the formula $f/g = f \cdot (1/g)$ implies the measurability of the
quotient. It is necessary to make the proviso in this theorem that $f(P)$
and $g(P)$ take finite values at every point of $\mathcal{E}$. Otherwise, the
operations on these functions may become meaningless.

If, say, $f = +\infty$ at some point and $g = -\infty$, we could not speak
about the sum $f + g$ at this point. If there is no such indeterminacy
in performing the operations on $f$ and $g$, infinite values are
permissible for $f(P)$ and $g(P)$. For instance, the following theorem may be
proved.

THEOREM 5. If $f(P)$ and $g(P)$ are measurable functions taking finite
values and the value $(+\infty)$, the function $f + g$ is measurable.

Let $A$ be the set on which at least one of the functions is equal to
$(+\infty)$. This set is measurable by virtue of the measurability of $f$ and
$g$, and the sum $f + g$ has the constant value $(+\infty)$ on the set $A$, i.e.
is measurable. Both the functions $f$ and $g$ have finite values on the set
$\mathcal{E}' = \mathcal{E} - A$, and by Theorem 4, the sum $f + g$ is measurable on $\mathcal{E}'$.
It is therefore measurable on $\mathcal{E} = \mathcal{E}' + A$, which is what we had
to prove.


44. The limit of a measurable function. We investigate in this
section passage to the limit for measurable functions. Our fundamental
result will be that a passage to the limit for measurable functions
leads to another measurable function. Some preliminary facts in
connection with limits must be given. Let

$$a_1, a_2, a_3, \ldots \tag{2}$$

be a sequence of real numbers, which may possibly include $(+\infty)$
or $(-\infty)$. Let $s_n$ denote the strict lower bound of the set of numbers
$[a_n, a_{n+1}, \ldots]$ and $t_n$ the strict upper bound of this set, i.e.

$$s_n = \inf\,[a_n, a_{n+1}, \ldots]; \quad t_n = \sup\,[a_n, a_{n+1}, \ldots]. \tag{3}$$

As $n$ increases, this number set is impoverished, i.e. $s_n$ does not
decrease and $t_n$ does not increase. Hence, as $n$ increases indefinitely,
the monotonic sequences $s_n$ and $t_n$ have finite or infinite limits:

$$\lim_{n \to \infty} s_n = S; \quad \lim_{n \to \infty} t_n = T, \tag{4}$$

where, in view of the monotonicity,

$$S = \sup s_n; \quad T = \inf t_n, \tag{5}$$

and, in addition, $s_n \leq t_n$ implies $S \leq T$. As regards the sequence
$(+\infty), (+\infty), \ldots$, we assume that its limit is $(+\infty)$, and similarly for
$(-\infty)$. The number $S$ is called the lower limit of sequence (2), whilst $T$ is
the upper limit of the sequence.

The following symbols are often used:

$$S = \underline{\lim}_{n \to \infty}\, a_n \quad \text{or} \quad S = \liminf a_n;$$

$$T = \overline{\lim}_{n \to \infty}\, a_n \quad \text{or} \quad T = \limsup a_n.$$
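
As a numerical illustration of (3) to (5), the sketch below computes the tail bounds for a finite truncation of the sequence $a_n = (-1)^n (1 + 1/n)$, whose lower and upper limits are $-1$ and $+1$. The sequence, the name `tail_bounds` and the truncation length `N` are assumptions made for this example and do not come from the text. A finite list can only approximate the bounds of an infinite tail, so the entries nearest the end of the list are artefacts of the truncation.

```python
from fractions import Fraction


def tail_bounds(a):
    """Return (s, t) with s[i] = min(a[i:]) and t[i] = max(a[i:]).

    This is the finite analogue of s_n = inf[a_n, a_{n+1}, ...] and
    t_n = sup[a_n, a_{n+1}, ...]; it is computed in one backward pass.
    """
    s, t = [], []
    lo = hi = None
    for x in reversed(a):
        lo = x if lo is None else min(lo, x)
        hi = x if hi is None else max(hi, x)
        s.append(lo)
        t.append(hi)
    s.reverse()
    t.reverse()
    return s, t


# Hypothetical example sequence: a_n = (-1)^n (1 + 1/n), n = 1, ..., N.
N = 200
a = [Fraction((-1) ** n) * (1 + Fraction(1, n)) for n in range(1, N + 1)]
s, t = tail_bounds(a)

# s is non-decreasing, t is non-increasing, and s_n <= t_n throughout.
print(float(s[100]), float(t[100]))
```

Exact rational arithmetic is used so that the monotonicity of $s_n$ and $t_n$ can be checked without rounding error.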

Let us prove the following lemma.

LEMMA. The necessary and sufficient condition for the existence of a
limit (finite or infinite) of sequence (2) is that $S = T$, and if this
condition is fulfilled, the limit is equal to $S$.

We first prove the sufficiency. We have $s_n \leq a_k \leq t_n$ for $k \geq n$, and
if the limits of $s_n$ and $t_n$ are the same, i.e. $S = T$, obviously $a_n \to S$.
We now prove the necessity. Let the sequence (2) have a finite limit $\sigma$.
All the numbers $a_n$ now lie in the interval $(\sigma - \varepsilon, \sigma + \varepsilon)$ for
sufficiently large $n$, $\varepsilon$ being any given small positive number. Hence the
closed interval $[\sigma - \varepsilon, \sigma + \varepsilon]$ contains all the $s_n$ and $t_n$ for
sufficiently large $n$. It follows, since $\varepsilon$ is arbitrary, that $s_n \to \sigma$ and
$t_n \to \sigma$, i.e. $S = T = \sigma$. The case of an infinite limit of sequence (2)
is similarly considered. We now prove some properties of sequences of
measurable functions.

THEOREM 1. If $f_n(P)$ is a sequence of measurable functions, the strict
lower and strict upper bounds of the values of $f_n(P)$ at any point $P$ of set $\mathcal{E}$
are also measurable functions, i.e. the functions

$$\varphi(P) = \inf_n f_n(P) \quad \text{and} \quad \psi(P) = \sup_n f_n(P) \tag{6}$$

are measurable.

Let us prove that, say, $\varphi(P)$ is measurable. If we have $\varphi(P) < a$ at
the point $P$, at least one of the values of $f_n(P)$ is also $< a$, and
conversely, if at least one of $f_n(P) < a$, then $\varphi(P) < a$. We therefore have

$$\mathcal{E}[\varphi(P) < a] = \sum_{n=1}^{\infty} \mathcal{E}[f_n(P) < a],$$

whence it follows, since the $f_n(P)$ are measurable, that $\varphi(P)$ is
measurable.

THEOREM 2. If $f_n(P)$ is a sequence of measurable functions, monotonically
increasing (or monotonically decreasing) at every point $P$ of the set
$\mathcal{E}$, the limit function $f(P)$ is also measurable.

This follows at once from the previous theorem, since the limit
function of a monotonically increasing sequence of functions is the
same as its strict upper bound $\psi(P)$, and the same as the lower bound
$\varphi(P)$ for a monotonically decreasing sequence.

THEOREM 3. If $f_n(P)$ is a sequence of measurable functions, the lower
limit $S(P)$ and upper limit $T(P)$ of the sequence are also measurable
functions.

We introduce the functions

$$s_n(P) = \inf\,[f_n(P), f_{n+1}(P), \ldots]; \quad t_n(P) = \sup\,[f_n(P), f_{n+1}(P), \ldots].$$

They are measurable for any $n$ by Theorem 1. The functions $S(P)$
and $T(P)$ are the limits of monotonic sequences $s_n(P)$ and $t_n(P)$, so
that, by Theorem 2, they are also measurable functions.

THEOREM 4. If $f_n(P)$ is a sequence of measurable functions, convergent
at every point $P$ of a set $\mathcal{E}$, the limit function $f(P)$ is also measurable.

The measurability of $f(P)$ follows from Theorem 3, since the limit
$f(P)$ must be the same as $S(P)$ and $T(P)$ when it exists at every point.
This last theorem is fundamental for what follows, and we shall
generalize it to some extent.

We say that a property holds almost everywhere on $\mathcal{E}$ if it holds at
all points of $\mathcal{E}$ except for a set of points of measure zero.

THEOREM 5. If $f_n(P)$ is a sequence of functions measurable on $\mathcal{E}$, and
convergent almost everywhere on $\mathcal{E}$, the limit function $f(P)$ is measurable.

Notice that the limit function $f(P)$ may not be defined on some
part $A$ of the set $\mathcal{E}$, where $A$ is of measure zero. We define $f(P)$ on $A$
in any manner. The sequence $f_n(P)$ is convergent at every point of the
measurable set $\mathcal{E}' = \mathcal{E} - A$, and, by Theorem 4, $f(P)$ is measurable
on $\mathcal{E}'$. In addition, it is measurable on $A$ by Theorem 3 of [42].
Consequently $f(P)$ is measurable on the set $\mathcal{E} = \mathcal{E}' + A$, and the theorem
is proved.

We introduce the new concept of the convergence of a sequence
of functions.

DEFINITION. Let $f_n(P)$ and $f(P)$ be functions measurable on $\mathcal{E}$ and
taking finite values. We say that the sequence $f_n(P)$ is convergent in
measure to $f(P)$ on $\mathcal{E}$ if, given any positive $\varepsilon$, the measure $G(\mathcal{E}_n)$ of
the set $\mathcal{E}_n$ of points at which the inequality $|f(P) - f_n(P)| \geq \varepsilon$ holds,
tends to zero on indefinite increase of $n$.

The connection between convergence in measure and convergence
almost everywhere is established in the next two theorems.

THEOREM 6. Let $\mathcal{E}$ be a measurable set of finite measure and $f_n(P)$ a
sequence of functions measurable on $\mathcal{E}$, which take finite values almost
everywhere on $\mathcal{E}$ and are convergent almost everywhere on $\mathcal{E}$ to the function
$f(P)$, which also takes finite values almost everywhere on $\mathcal{E}$. Then the $f_n(P)$
are convergent in measure to $f(P)$ on $\mathcal{E}$.

Let $\varepsilon$ be a given positive number. We introduce the set of points $\mathcal{E}_n$:

$$\mathcal{E}_n = \mathcal{E}\,[\,|f(P) - f_n(P)| \geq \varepsilon\,].$$

We have to show that $G(\mathcal{E}_n) \to 0$. We introduce the sets of points
at which $f(P)$ and $f_n(P)$ take infinite values, and the set on which $f_n(P)$
does not tend to $f(P)$:

$$A = \mathcal{E}\,[\,|f(P)| = +\infty\,]; \quad A_n = \mathcal{E}\,[\,|f_n(P)| = +\infty\,];$$

$$B = \mathcal{E}\,[\,f_n(P) \text{ does not} \to f(P)\,].$$

By hypothesis, all these sets are of measure zero. The same can be
proved for their sum [36]:

$$C = A + \sum_{n=1}^{\infty} A_n + B,$$

i.e. $G(C) = 0$. If $P_0$ does not belong to $C$, $f_n(P_0)$ and $f(P_0)$ have finite
values, and $f_n(P_0) \to f(P_0)$. We introduce the sets

$$R_n = \sum_{k=n}^{\infty} \mathcal{E}_k \quad \text{and} \quad S = \prod_{n=1}^{\infty} R_n. \tag{7}$$

The sequence $R_n$ $(n = 1, 2, \ldots)$ is a non-increasing sequence of sets
of finite measure, since $\mathcal{E}$ has finite measure, and $S$ is the limit set for $R_n$,
so that

$$G(R_n) \to G(S). \tag{8}$$

We show that $S \subset C$, i.e. that, if $P_0$ does not belong to $C$, then $P_0$
does not belong to $S$. In fact, if $P_0$ does not belong to $C$, $f_n(P_0)$ and
$f(P_0)$ are finite and $f_n(P_0) \to f(P_0)$, i.e. there exists an $N$ such that
$|f(P_0) - f_n(P_0)| < \varepsilon$ for $n \geq N$. Hence it follows that $P_0$ does not belong to
$\mathcal{E}_n$ for $n \geq N$, i.e. $P_0$ does not belong to $R_n$ for $n \geq N$, so that $P_0$
does not belong to $S$. Hence $S \subset C$. But $G(C) = 0$, so that $G(S) = 0$,
and by (8), $G(R_n) \to 0$. But, by the first of formulae (7), $\mathcal{E}_n \subset R_n$, i.e.
all the more $G(\mathcal{E}_n) \to 0$, which is what we wanted to prove.
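
Theorem 6 can be illustrated on a simple case chosen for this note (it does not appear in the text). On $[0, 1]$ with Lebesgue measure, the functions $f_n(x) = x^n$ tend to 0 at every $x < 1$, i.e. almost everywhere, and for $0 < \varepsilon \leq 1$ the set on which $|f_n(x) - 0| \geq \varepsilon$ is the interval $[\varepsilon^{1/n}, 1]$, of length $1 - \varepsilon^{1/n}$. The function name `exceptional_measure` is hypothetical.

```python
def exceptional_measure(n, eps):
    """Length of {x in [0, 1] : x**n >= eps}, i.e. of [eps**(1/n), 1].

    Valid for 0 < eps <= 1; this closed form is specific to f_n(x) = x**n.
    """
    if not 0.0 < eps <= 1.0:
        raise ValueError("eps must lie in (0, 1]")
    return 1.0 - eps ** (1.0 / n)


eps = 0.01
measures = [exceptional_measure(n, eps) for n in (1, 10, 100, 1000, 10000)]

# The measures decrease towards zero, as convergence in measure requires.
for n, m in zip((1, 10, 100, 1000, 10000), measures):
    print(n, m)
```

The single point $x = 1$, where $f_n(1) = 1$ for every $n$, has measure zero and so plays the part of the exceptional set $C$ in the proof.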

Note. We can associate the set $C$ with all the $\mathcal{E}_n$. Since $G(C) = 0$,
we again have $G(\mathcal{E}_n) \to 0$ after such addition, whilst
$|f(P) - f_n(P)| < \varepsilon$ at every point of the set $(\mathcal{E} - \mathcal{E}_n)$. Convergence in
measure does not necessarily imply convergence almost everywhere, but the
following theorem holds.

THEOREM 7. Let $\mathcal{E}$ be a measurable set of finite measure, $f_n(P)$ and
$f(P)$ be measurable on $\mathcal{E}$, where $f_n(P)$ are convergent in measure to $f(P)$
on $\mathcal{E}$. There now exists a subsequence $f_{n_k}(P)$ that tends to $f(P)$ almost
everywhere on $\mathcal{E}$.

We choose a sequence of positive numbers $\delta_k$ $(k = 1, 2, \ldots)$ such
that $\delta_k \to 0$ as $k \to \infty$, and a sequence of positive numbers $\varepsilon_k$ such
that the series $\varepsilon_1 + \varepsilon_2 + \ldots$ is convergent. In view of the convergence
in measure, there exists an indefinitely increasing sequence of
subscripts $n_k$ such that $G(\mathcal{E}_k) < \varepsilon_k$ for the sets
$\mathcal{E}_k = \mathcal{E}\,[\,|f(P) - f_{n_k}(P)| \geq \delta_k\,]$. We introduce the sets

$$R_n = \sum_{k=n}^{\infty} \mathcal{E}_k; \quad S = \prod_{n=1}^{\infty} R_n.$$

It is easily shown that $G(S) = 0$. For

$$G(R_n) \leq \sum_{k=n}^{\infty} G(\mathcal{E}_k) < \sum_{k=n}^{\infty} \varepsilon_k,$$

and the last sum $\to 0$ as $n \to \infty$ by virtue of the convergence of the
series $\varepsilon_1 + \varepsilon_2 + \ldots$ We now show that $f_{n_k}(P) \to f(P)$ on the set
$\mathcal{E} - S$. Since $G(S) = 0$, this will prove the theorem.

Let the point $P_0 \in \mathcal{E} - S$ and hence $P_0 \notin S$. Hence it follows that
$P_0$ does not belong to $R_n$ for all sufficiently large $n$, so that $P_0$ does
not belong to $\mathcal{E}_k$ for all sufficiently large $k$, i.e. there exists an $N$ such
that $P_0 \notin \mathcal{E}_k$ for $k \geq N$. On recalling the definition of $\mathcal{E}_k$, we get
$|f(P_0) - f_{n_k}(P_0)| < \delta_k$ for $k \geq N$, whence it follows that
$f_{n_k}(P_0) \to f(P_0)$, since $\delta_k \to 0$ as $k \to \infty$.
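
The remark that convergence in measure need not imply convergence almost everywhere, and the subsequence extracted in Theorem 7, can both be seen on the standard "sliding interval" example. This example is not taken from the text; the indexing $m = 2^j + i$ and all function names below are assumptions made for the sketch. The function $f_m$ is the characteristic function of $[i/2^j, (i+1)/2^j)$ on $[0, 1)$, so the set where $f_m \neq 0$ has length $2^{-j} \to 0$, yet at every point the sequence takes the value 1 once in each level $j$.

```python
from fractions import Fraction


def level_and_offset(m):
    """Write m >= 1 as m = 2**j + i with 0 <= i < 2**j and return (j, i)."""
    j = m.bit_length() - 1
    return j, m - (1 << j)


def f(m, x):
    """Characteristic function of [i/2**j, (i+1)/2**j), where m = 2**j + i."""
    j, i = level_and_offset(m)
    return 1 if Fraction(i, 2**j) <= x < Fraction(i + 1, 2**j) else 0


def measure_of_exceptional_set(m):
    """Length of the set where |f_m - 0| >= eps, for any 0 < eps <= 1."""
    j, _ = level_and_offset(m)
    return Fraction(1, 2**j)


x = Fraction(1, 3)

# Full sequence at x: one value 1 in every level j = 0, ..., 9, so no limit.
values = [f(m, x) for m in range(1, 2**10)]

# Subsequence n_k = 2**k: characteristic functions of [0, 2**-k).
sub = [f(2**k, x) for k in range(10)]

print(sum(values), sub)
```

Along the subsequence $n_k = 2^k$ the functions vanish at every $x > 0$ from some $k$ onwards, which is convergence almost everywhere to 0, in agreement with Theorem 7.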

Note. It might obviously be assumed that, as in Theorem 6,
$f_n(P)$ and $f(P)$ are only finite almost everywhere on $\mathcal{E}$, and that
$f_n(P)$ is convergent in measure to $f(P)$ on the set remaining after
exclusion of the sets $A$ and $A_n$ from $\mathcal{E}$.

There is a theorem connecting convergence almost everywhere
with uniform convergence. This theorem was proved in 1911 by
Egoroff. We shall merely state the theorem, since we make no use
of it in future.

THEOREM. Let $\mathcal{E}$ be a measurable set of finite measure and $f_n(P)$
a sequence of functions measurable on $\mathcal{E}$, which take finite values almost
everywhere on $\mathcal{E}$ and are convergent almost everywhere on $\mathcal{E}$ to the
function $f(P)$, which also takes finite values almost everywhere on $\mathcal{E}$.
Then, given any positive $\varepsilon$, there exists a closed set $F$ belonging to
$\mathcal{E}$ such that $G(\mathcal{E} - F) < \varepsilon$ and the convergence $f_n(P) \to f(P)$ is
uniform on $F$.


45. The C property. It can be shown that the measurability of a function is
equivalent to another property, the C property, which is defined with the aid
of the concept of continuity.

We must first introduce some new concepts.

A function $f(P)$, defined on a closed set $F$, is said to be continuous at a point
$P_0$ of this set if, given any positive $\varepsilon$, there exists a positive $\eta$ such that
$|f(P_0) - f(P)| < \varepsilon$, if $P \in F$ and belongs to an $\eta$-neighbourhood of the point $P_0$.
The function $f(P)$ is said to be continuous on the closed set $F$ if it is continuous
at every point of $F$. Notice that, by virtue of our definition of continuity, any
function is continuous at an isolated point $P_0$ of a set, i.e. at a point, an
$\varepsilon$-neighbourhood of which contains no points of $F$ except $P_0$. A similar definition
can be given of continuity on any (not necessarily closed) set. We now introduce
a further concept.

DEFINITION. We say that a function $f(P)$, defined on a measurable set $\mathcal{E}$,
has the C property on this set if, given any positive $\varepsilon$, there exists a closed set $F$
belonging to $\mathcal{E}$ such that, firstly, $G(\mathcal{E} - F) < \varepsilon$, and secondly, $f(P)$ is continuous
on $F$.

The equivalence of the C property and measurability was established by a
theorem proved by Luzin in 1913.

THEOREM. If a function $f(P)$ is defined on a measurable set $\mathcal{E}$ of finite measure
and has finite values almost everywhere on $\mathcal{E}$, the necessary and sufficient condition
for this function to be measurable is that it has the C property on $\mathcal{E}$.

We shall make no use of this theorem and shall not dwell on the proof.


46. Piecewise constant functions. We now define a class of functions
that are often used in theoretical investigations.

DEFINITION. A function $f(P)$, defined on a measurable set $\mathcal{E}$, is
said to be piecewise constant on this set if it takes only a finite or
denumerable set of values on $\mathcal{E}$.

Let $c_k$ $(k = 1, 2, \ldots)$ be the different values taken by $f(P)$ on $\mathcal{E}$,
which may possibly include $(-\infty)$ and $(+\infty)$. For $f(P)$ to be
measurable, it is obviously necessary and sufficient that the sets of
points $\mathcal{E}_k$ on which $f(P)$ is equal to $c_k$ be measurable for all $k$ [42].
We shall in future take into consideration only measurable piecewise
constant functions.

We bring in a new concept. If $\mathcal{E}'$ is a set of points, the characteristic
function of this set is defined as the function $\omega_{\mathcal{E}'}(P)$, defined
throughout the plane, such that $\omega_{\mathcal{E}'}(P) = 1$ if $P$ belongs to $\mathcal{E}'$ and
$\omega_{\mathcal{E}'}(P) = 0$ if $P$ does not belong to $\mathcal{E}'$. A piecewise constant function
$f(P)$ is a linear combination of characteristic functions:

$$f(P) = \sum_{k} c_k\, \omega_{\mathcal{E}_k}(P), \tag{9}$$

where $P$ belongs to $\mathcal{E}$. Since the $\mathcal{E}_k$ have no common points (the $c_k$
are different numbers), only one term in the sum written differs from
zero except in the case when the $c_k$ corresponding to the chosen point
$P$ is zero. All the terms are zero in this latter case.

Obviously, the characteristic function $\omega_{\mathcal{E}'}(P)$ is measurable when
and only when $\mathcal{E}'$ is a measurable set.
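
Formula (9) can be tried out on a toy case. The sketch below is only an illustration under stated assumptions: the ground set `E` is a hypothetical finite set of integers (so questions of measurability do not arise), and the names `characteristic` and `f_from_formula` are invented for the example.

```python
def characteristic(subset):
    """Return the characteristic function of `subset` as a callable."""
    members = frozenset(subset)
    return lambda p: 1 if p in members else 0


E = range(12)                      # hypothetical finite ground set


def f(p):
    """A piecewise constant function on E taking the values 0, 10, 20."""
    return (p % 3) * 10


# Group the points of E by the value f takes on them: E_k = E[f = c_k].
level_sets = {}
for p in E:
    level_sets.setdefault(f(p), set()).add(p)

# One (c_k, omega_k) pair per distinct value c_k.
terms = [(c, characteristic(s)) for c, s in level_sets.items()]


def f_from_formula(p):
    """Evaluate the sum of c_k * omega_k(p), as in formula (9)."""
    return sum(c * omega(p) for c, omega in terms)


print([f_from_formula(p) for p in E])
```

Because the level sets are pairwise disjoint, exactly one characteristic function equals 1 at each point, so at most one term of the sum is non-zero.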

We next prove the possibility of obtaining measurable functions as
the limits of piecewise constant functions. We shall confine ourselves
here to non-negative functions.

THEOREM 1. Given any function $f(P)$, non-negative, bounded and measurable
on a measurable set $\mathcal{E}$, there exists an increasing sequence $f_n(P)$ of
non-negative measurable piecewise constant functions on $\mathcal{E}$ with a finite
number of values, which is uniformly convergent to $f(P)$ on $\mathcal{E}$.

Since $f(P)$ is bounded, a positive number $L$ exists such that
$0 \leq f(P) < L$. We divide the interval $[0, L]$ into $2^n$ equal parts by the
points

$$k\,\frac{L}{2^n} \quad (k = 1, 2, \ldots, 2^n - 1).$$

We bring in the measurable sets

$$\mathcal{E}_k^{(n)} = \mathcal{E}\left[k\,\frac{L}{2^n} \leq f(P) < (k + 1)\,\frac{L}{2^n}\right] \quad (k = 0, 1, \ldots, 2^n - 1)$$

and construct a sequence of functions $f_n(P)$ as follows:

$$f_n(P) = k\,\frac{L}{2^n}, \quad \text{if } P \in \mathcal{E}_k^{(n)}. \tag{10}$$

It may easily be seen that the sequence $f_n(P)$ satisfies all the
requirements of the theorem. Each of the $f_n(P)$ takes a finite number of
values on $\mathcal{E}$. Further, on passing from $n$ to $n + 1$, each interval

$$\left[k\,\frac{L}{2^n},\ (k + 1)\,\frac{L}{2^n}\right)$$

is split into two:

$$\left[2k\,\frac{L}{2^{n+1}},\ (2k + 1)\,\frac{L}{2^{n+1}}\right) \quad \text{and} \quad \left[(2k + 1)\,\frac{L}{2^{n+1}},\ (2k + 2)\,\frac{L}{2^{n+1}}\right),$$

so that each of the sets $\mathcal{E}_k^{(n)}$ is split into two sets:

$$\mathcal{E}_k^{(n)} = \mathcal{E}_{2k}^{(n+1)} + \mathcal{E}_{2k+1}^{(n+1)}.$$

On the set $\mathcal{E}_{2k}^{(n+1)}$ the function $f_{n+1}(P)$ is equal to the same number
$kL/2^n$ as the function $f_n(P)$ on the whole of the set $\mathcal{E}_k^{(n)}$, whilst on the set
$\mathcal{E}_{2k+1}^{(n+1)}$ the function $f_{n+1}(P)$ is equal to

$$(2k + 1)\,\frac{L}{2^{n+1}} = k\,\frac{L}{2^n} + \frac{L}{2^{n+1}},$$

i.e. the sequence of $f_n(P)$ is increasing. Further, we have on any set
$\mathcal{E}_k^{(n)}$:

$$f_n(P) = k\,\frac{L}{2^n} \quad \text{and} \quad k\,\frac{L}{2^n} \leq f(P) < (k + 1)\,\frac{L}{2^n}.$$

Thus

$$0 \leq f(P) - f_n(P) < \frac{L}{2^n}$$

at every point $P$ belonging to $\mathcal{E}$, whence it follows that the sequence
$f_n(P)$ tends to $f(P)$ uniformly on $\mathcal{E}$. We consider in the next theorem
the case when $f(P)$ may be unbounded.

THEOREM 2. Given any function $f(P)$, non-negative, finite and measurable
on the set $\mathcal{E}$, there exists an increasing sequence of $f_n(P)$,
non-negative and piecewise constant on $\mathcal{E}$, such that it tends to $f(P)$ uniformly
on $\mathcal{E}$.

In this case we subdivide the infinite interval $[0, +\infty)$ with the aid
of the points

$$\frac{k}{2^n} \quad (k = 1, 2, 3, \ldots).$$

We again define sets $\mathcal{E}_k^{(n)} = \mathcal{E}\,[\,k/2^n \leq f(P) < (k + 1)/2^n\,]$ and
functions

$$f_n(P) = \frac{k}{2^n}, \quad \text{if } P \in \mathcal{E}_k^{(n)}. \tag{11}$$

It can be shown, precisely as in the previous theorem, that the
sequence $f_n(P)$ satisfies all the requirements of the theorem. In the
case of an unbounded function $f(P)$, the $f_n(P)$ will take a denumerable
number of values. If we relinquish the uniformity of the convergence
we can confine ourselves to piecewise constant functions with a
finite number of values. In addition, we shall assume in the next
theorem that $(+\infty)$ is a possible value of $f(P)$.

THEOREM 3. Given any non-negative function $f(P)$, measurable on a
measurable set $\mathcal{E}$, there exists an increasing sequence $\varphi_n(P)$ of
non-negative piecewise constant functions on $\mathcal{E}$ with a finite number of finite
values, which tends to $f(P)$ at every point of $\mathcal{E}$.

In addition to the sets $\mathcal{E}_k^{(n)}$ of the previous theorem, we also form
the set $\mathcal{E}_\infty = \mathcal{E}[f(P) = +\infty]$ and introduce the sequence of functions
$\varphi_n(P)$ as follows:

$$\varphi_n(P) = f_n(P), \quad \text{if } f_n(P) \leq n$$

and $\varphi_n(P) = n$ if $f_n(P) > n$ or $P \in \mathcal{E}_\infty$. It is easily seen that the $\varphi_n(P)$
satisfy all the requirements of the theorem. We shall have occasion
to use these theorems later.
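
The constructions of Theorems 2 and 3 act on the value of the function at each point separately, so they can be sketched as operations on a single number. The code below is a minimal illustration, assuming the value $f(P)$ is supplied as a Python float (with `math.inf` standing for $+\infty$); the names `f_n` and `phi_n` are chosen to mirror the text and are not from any library.

```python
import math


def f_n(value, n):
    """Dyadic lower approximation of a finite value >= 0.

    Returns k / 2**n, where k is the integer with
    k / 2**n <= value < (k + 1) / 2**n, as in formula (11).
    """
    return math.floor(value * 2**n) / 2**n


def phi_n(value, n):
    """Truncated approximation of Theorem 3.

    Equals f_n(value, n) where that does not exceed n, and n otherwise
    (in particular where value is +infinity), so for fixed n only
    finitely many finite values can occur.
    """
    if value == math.inf:
        return float(n)
    return min(f_n(value, n), float(n))


for v in (0.3, 2.75, 7.1, math.inf):
    print(v, [phi_n(v, n) for n in range(1, 9)])
```

For a finite value the error of `f_n` is below $2^{-n}$ whatever the value, which is the uniform convergence of Theorem 2. The truncation in `phi_n` gives up this uniformity, since a large value $v$ is not approximated until $n$ exceeds $v$.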


47. Class B. We mentioned in [41] a closed field of point sets,
such that every set belonging to this field appears in any field of sets
$L_G$. Similarly, we shall now indicate a family of functions such that
any function of the family is a measurable function for any choice of
measurable function $G(\Delta)$.

DEFINITION. A function $f(P)$, defined on a set $\mathcal{E}$, which is B measurable,
is described as a B function if the sets

$$\mathcal{E}[f \geq a]; \quad \mathcal{E}[f < a]; \quad \mathcal{E}[f > a]; \quad \mathcal{E}[f \leq a]$$

are B measurable for any real $a$.

It follows at once from this definition that every B function is
measurable for any choice of $G(\Delta)$. Another definition of B function
can be given, completely analogous to the definition of B measurable
sets that we gave in [41]. We take all possible families of functions
possessing the following two properties: firstly, the family contains all
functions continuous on $\mathcal{E}$, and secondly, if the family contains a
sequence of functions $f_n(P)$ convergent at every point of $\mathcal{E}$, the family
also contains the limit function. A family of B functions is a family
of functions that belong to all families of functions with the two
properties indicated. We shall not dwell on the proof that this last
definition is equivalent to the previous one.

Certain details regarding the last definition must be mentioned.
Every continuous function is a B function, and such a function is
usually said to belong to the zero class. If $f(P)$ is the limit function of a
sequence of continuous functions convergent at every point of $\mathcal{E}$, $f(P)$
itself being discontinuous, such an $f(P)$ is said to belong to the first
class. Every function of the first class is also a B function. If $f(P)$
is the limit function of a sequence of functions of the first class,
convergent at every point of $\mathcal{E}$, where $f(P)$ itself is not of the first class,
we say that $f(P)$ belongs to the second class. Every function of the
second class is also a B function. The subsequent classes of functions
are similarly defined, and all the functions of these classes are B
functions. Let us proceed further with the construction. Let $f(P)$ be
the limit function of a sequence of functions $f_n(P)$, each $f_n(P)$ being a
member of some class with a finite index (number), whilst $f(P)$
itself does not belong to a class with a finite index. We now say that
$f(P)$ belongs to the class of functions with transfinite index $\omega$. Every
function of this class is a B function. We can further define a class of
functions with transfinite index $(\omega + 1)$, and so on. All the B functions
can be obtained by this method. This assertion requires a supplementary
discussion of transfinite numbers, which we must omit.

It can be shown that every function $f(P)$, measurable on a set $\mathcal{E}$,
which is B measurable, is equivalent to some B function $g(P)$ on
this set.


§ 3. The Lebesgue integral 


48. The integral of a bounded function. We shall now give an
ordinary definition of the integral of a bounded function by using a
subdivision into all possible measurable sets, and show that every
bounded measurable function is integrable. Let $f(P)$ be a given
bounded function on a measurable set $\mathcal{E}$ of finite measure, i.e.
we have $|f(P)| \leq L$ on $\mathcal{E}$, where $L$ is a positive number. We subdivide
$\mathcal{E}$ into a finite number of pairwise disjoint measurable subsets $\mathcal{E}_k$:

$$\mathcal{E} = \sum_{k=1}^{n} \mathcal{E}_k. \tag{1}$$

Let $m_k$ and $M_k$ be the strict lower and strict upper bounds of the
values of $f(P)$ on $\mathcal{E}_k$. We form the usual sums

$$s_\delta = \sum_{k=1}^{n} m_k\, G(\mathcal{E}_k); \quad S_\delta = \sum_{k=1}^{n} M_k\, G(\mathcal{E}_k), \tag{2}$$

where $\delta$ denotes the subdivision (1) of set $\mathcal{E}$. The sums $s_\delta$ and $S_\delta$ are
bounded for any subdivisions, in fact $|s_\delta|$ and $|S_\delta| \leq L \cdot G(\mathcal{E})$.
Further, let $i$ be the strict upper bound of the sums $s_\delta$ and $I$ the strict
lower bound of the sums $S_\delta$ for all possible subdivisions of $\mathcal{E}$ into a
finite number of measurable sets.

DEFINITION. If $i = I$, we say that $f(P)$ is integrable with respect to
$G(\mathcal{E})$ on the set $\mathcal{E}$, and the value of the integral is taken equal to $i$:

$$i = \int_{\mathcal{E}} f(P)\, G(d\mathcal{E}).$$

The integral thus defined is called a Lebesgue-Stieltjes integral.
If $\delta$ is the subdivision (1) and $\delta'$ is any other subdivision:

$$\mathcal{E} = \sum_{k=1}^{n'} \mathcal{E}'_k, \tag{3}$$

the product $\delta\delta'$ of the subdivisions is defined as the subdivision
consisting of all possible subsets $\mathcal{E}_k \mathcal{E}'_l$. These sets are obviously
disjoint. Certain of them may in fact be empty. Subdivision (3) is
said to be an extension of subdivision (1) if every set $\mathcal{E}'_k$ is part of one
of the $\mathcal{E}_k$. If $\delta_2$ is an extension of $\delta_1$, we write $\delta_2 > \delta_1$.

In addition to sums (2), we form the sum

$$\sigma_\delta = \sum_{k=1}^{n} f(P_k)\, G(\mathcal{E}_k), \tag{4}$$

as for the Stieltjes integral, where $P_k$ is a point of $\mathcal{E}_k$. Everything that
was said in [3] holds for $s_\delta$, $S_\delta$, $\sigma_\delta$, $i$ and $I$.

We next indicate a sequence of subdivisions for any bounded function
$f(P)$ measurable on $\mathcal{E}$ such that $S_\delta - s_\delta \to 0$, i.e. $\sigma_\delta$ has a definite
limit. It follows from this that the integral $i$ of $f(P)$ with respect to
$G(\mathcal{E})$ exists, and that $s_\delta$, $S_\delta$ and $\sigma_\delta$ tend to $i$ for the sequence in
question [3].

Suppose, then, that $f(P)$ is a bounded function, defined and measurable
on $\mathcal{E}$, and let $m$ and $M$ be the strict lower and strict upper bounds
of the values of $f(P)$ on $\mathcal{E}$. We divide the interval $[m, M]$ of variation
of the function into sub-intervals by the points $y_k$:

$$m = y_0 < y_1 < y_2 < \ldots < y_{n-1} < y_n = M, \tag{5}$$

and let $\lambda$ be the greatest of the differences $y_k - y_{k-1}$. We define the
following subdivision $\delta$ of the set $\mathcal{E}$ into measurable subsets $\mathcal{E}_k$:

$$\mathcal{E}_1 = \mathcal{E}[y_0 \leq f(P) \leq y_1]; \quad \mathcal{E}_k = \mathcal{E}[y_{k-1} < f(P) \leq y_k] \quad (k = 2, 3, \ldots, n). \tag{6}$$

It follows at once from this definition of sets $\mathcal{E}_k$ that $y_{k-1} \leq m_k$ and
$y_k \geq M_k$, i.e.

$$\sum_{k=1}^{n} y_{k-1}\, G(\mathcal{E}_k) \leq s_\delta \leq S_\delta \leq \sum_{k=1}^{n} y_k\, G(\mathcal{E}_k) \tag{7}$$

and all the more

$$\sum_{k=1}^{n} y_{k-1}\, G(\mathcal{E}_k) \leq i \leq I \leq \sum_{k=1}^{n} y_k\, G(\mathcal{E}_k). \tag{8}$$

We consider the difference between the extreme sums:

$$\sum_{k=1}^{n} y_k\, G(\mathcal{E}_k) - \sum_{k=1}^{n} y_{k-1}\, G(\mathcal{E}_k) = \sum_{k=1}^{n} (y_k - y_{k-1})\, G(\mathcal{E}_k). \tag{9}$$

Noticing that $y_k - y_{k-1} \leq \lambda$ and that $G(\mathcal{E})$ is additive, we get

$$0 \leq \sum_{k=1}^{n} y_k\, G(\mathcal{E}_k) - \sum_{k=1}^{n} y_{k-1}\, G(\mathcal{E}_k) \leq \lambda\, G(\mathcal{E}), \tag{10}$$

so that the difference in (10) tends to zero as $\lambda \to 0$. Hence, by (7)
and (8), it follows that $i = I$ and $S_\delta - s_\delta \to 0$. Subdivision (6) of the
basic set $\mathcal{E}$ into subsets $\mathcal{E}_k$ is called a Lebesgue subdivision. It is
defined by the subdivision (5) of the interval $[m, M]$ of variation of the
function $f(P)$. The sums of (7) and (8), corresponding to subdivision
(6), are called Lebesgue sums. The above discussion leads to the following
fundamental theorem.

FUNDAMENTAL THEOREM. A bounded measurable function $f(P)$, given
on a measurable set $\mathcal{E}$ of finite measure, is integrable on
$\mathcal{E}$, and the value of the integral is equal to the limit of the
Lebesgue sums or the sums $\sigma_\delta$ for any choice of points $P_k$ of the
Lebesgue subdivision, on indefinite refinement of the subdivision of the
interval $[m, M]$ of variation of $f(P)$.
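The construction can be tried numerically. The following is a minimal sketch, not part of the original text: it assumes that the measure $G$ is concentrated on finitely many points with given weights, so that every $G(\mathcal{E}_k)$ is a finite sum, and the name `lebesgue_sums` is hypothetical. It forms the two Lebesgue sums of (7) for an equal subdivision of $[m, M]$.

```python
import math

def lebesgue_sums(values, weights, n):
    """Lower and upper Lebesgue sums for a finite weighted point set.

    values[i] is f(P_i); weights[i] >= 0 is the measure G of the point P_i
    (an assumed discrete measure). The range [m, M] is cut into n equal parts.
    Returns (lower sum, upper sum, eta).
    """
    m, M = min(values), max(values)
    total = sum(weights)
    if M == m:                                  # constant function: both sums equal m*G(E)
        return m * total, m * total, 0.0
    eta = (M - m) / n                           # every difference y_k - y_{k-1}
    y = [m + k * eta for k in range(n + 1)]
    measure = [0.0] * (n + 1)                   # measure[k] approximates G(E_k), k = 1..n
    for v, w in zip(values, weights):
        # E_1 = E[y_0 <= f <= y_1], E_k = E[y_{k-1} < f <= y_k] for k >= 2
        k = max(1, min(n, math.ceil((v - m) / eta)))
        measure[k] += w
    lower = sum(y[k - 1] * measure[k] for k in range(1, n + 1))
    upper = sum(y[k] * measure[k] for k in range(1, n + 1))
    return lower, upper, eta

# Usage: f(x) = x^2 sampled at 1000 equally weighted points of [0, 1].
xs = [(i + 0.5) / 1000 for i in range(1000)]
vals = [x * x for x in xs]
wts = [1.0 / 1000] * 1000
lo, up, eta = lebesgue_sums(vals, wts, 50)
direct = sum(v * w for v, w in zip(vals, wts))  # the integral for this discrete measure
```

For such a measure the integral is the plain weighted sum `direct`, which lies between `lo` and `up`, and `up - lo` does not exceed `eta * sum(wts)`, in agreement with (10).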

Notice the familiar fact that the sums $\sigma_\delta$ have the same limit for
any extensions of the subdivisions mentioned in the fundamental
theorem. Since the integral is defined in the usual way, as the limit
of sums $\sigma_\delta$, it retains the usual properties of the Riemann integral
and the classical Stieltjes integral. We shall prove these properties in the
next section.

We have termed the above integral a Lebesgue-Stieltjes integral.
It is called simply a Lebesgue integral in the particular case when
$G(\Delta)$ is the area of the interval $\Delta$.

We have seen that every bounded function with a finite number of
discontinuities is measurable. Let $f(P)$ be such a function, given on
a finite closed interval $\Delta$. We know that such a function is Riemann
integrable over the interval $\Delta$. As a bounded measurable function, it
is also Lebesgue integrable. Let us show that the Lebesgue integral is
the same as the Riemann integral. In fact, the Lebesgue integral can
be obtained simply by taking some sequence of subdivisions of $\Delta$ into
measurable sets for which the sum (4) has a definite limit, which in
fact gives the value of the Lebesgue integral. But since the function
is Riemann integrable, the subdivision of $\Delta$ into sub-intervals leads, on
indefinite diminishing of the sub-intervals, to a definite limit for sum (4),
and this limit is the Riemann integral. It follows from these arguments that
the Riemann and Lebesgue integrals are the same.

Lebesgue showed that the necessary and sufficient condition for
the existence of the Riemann integral is as follows: $f(P)$ is bounded
and the set of its points of discontinuity has Lebesgue measure
zero [cf. 10]. As we have shown, such a function is also Lebesgue
integrable. The coincidence of the Lebesgue and Riemann integrals can be
proved precisely as above. Thus every function that is Riemann integrable
over a finite closed interval (in the proper sense) is Lebesgue integrable,
the Lebesgue and Riemann integrals being the same.


49. Properties of the integral. We give below the basic properties
of the Lebesgue-Stieltjes integral. It is assumed in all these theorems
that $\mathcal{E}$ is a measurable set of finite measure.

1. If $c$ is a constant, then

$$ \int_{\mathcal{E}} c\, G(d\mathcal{E}) = c\, G(\mathcal{E}). \tag{11} $$

The sums $s_\delta$ and $S_\delta$ have the value $cG(\mathcal{E})$ for any
subdivision $\delta$, whence (11) follows [3].

2. If $f_1(P)$ and $f_2(P)$ are bounded and measurable on $\mathcal{E}$, then

$$ \int_{\mathcal{E}} [f_1(P) + f_2(P)]\, G(d\mathcal{E}) = \int_{\mathcal{E}} f_1(P)\, G(d\mathcal{E}) + \int_{\mathcal{E}} f_2(P)\, G(d\mathcal{E}). \tag{12} $$

Let $\delta'_n$ and $\delta''_n$ be subdivisions with which $\sigma_{\delta'_n}$ for
the function $f_1(P)$ and $\sigma_{\delta''_n}$ for $f_2(P)$ have the
corresponding integrals as limits. With the subdivision
$\delta_n = \delta'_n \delta''_n$, the sums $\sigma_{\delta_n}$ have the
corresponding integrals as limits both for $f_1(P)$ and $f_2(P)$, and (12)
follows from the theorem on the limit of a sum. In future, we shall not
stipulate that the functions be bounded and measurable.


3.

$$ \int_{\mathcal{E}} \sum_{k=1}^{p} c_k f_k(P)\, G(d\mathcal{E}) = \sum_{k=1}^{p} c_k \int_{\mathcal{E}} f_k(P)\, G(d\mathcal{E}). \tag{13} $$

The fact that a constant factor can be taken outside the integral
sign follows at once from the possibility of taking the factor outside
the brackets in the sums $\sigma_\delta$. In addition, property 2 has to be
applied several times.

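4. If $f(P) \ge 0$ on $\mathcal{E}$, then

$$ \int_{\mathcal{E}} f(P)\, G(d\mathcal{E}) \ge 0. \tag{14} $$

All the sums $\sigma_\delta$ are non-negative.

5. If $f_1(P) \ge f_2(P)$, then

$$ \int_{\mathcal{E}} f_1(P)\, G(d\mathcal{E}) \ge \int_{\mathcal{E}} f_2(P)\, G(d\mathcal{E}). \tag{15} $$

It is sufficient to apply 4 to $f_1(P) - f_2(P)$ and make use of 3.

6.

$$ \left| \int_{\mathcal{E}} f(P)\, G(d\mathcal{E}) \right| \le \int_{\mathcal{E}} |f(P)|\, G(d\mathcal{E}). \tag{16} $$

Here, it is sufficient to take the product of subdivisions (6) for $f$ and
$|f|$ and write the analogous inequality for the sums $\sigma_\delta$.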
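7. If $a \le f(P) \le b$ on $\mathcal{E}$, then

$$ a\, G(\mathcal{E}) \le \int_{\mathcal{E}} f(P)\, G(d\mathcal{E}) \le b\, G(\mathcal{E}) \tag{17} $$

follows directly from 5 and 1.

8. If $|f(P)| \le L$, then

$$ \left| \int_{\mathcal{E}} f(P)\, G(d\mathcal{E}) \right| \le L\, G(\mathcal{E}). \tag{18} $$

By hypothesis, $-L \le f(P) \le +L$, and (18) is a consequence of 7.

9. If $\mathcal{E} = \mathcal{E}' + \mathcal{E}''$, where $\mathcal{E}'$ and
$\mathcal{E}''$ are measurable and have no common points, then

$$ \int_{\mathcal{E}} f(P)\, G(d\mathcal{E}) = \int_{\mathcal{E}'} f(P)\, G(d\mathcal{E}) + \int_{\mathcal{E}''} f(P)\, G(d\mathcal{E}). \tag{19} $$

This is proved simply by taking subdivision (6) for the sets $\mathcal{E}'$ and
$\mathcal{E}''$, forming $\sigma_\delta$ for these subdivisions and taking their
sum. This last sum will have a definite limit, which proves (19).

10. If $\mathcal{E}$ is split into a finite or denumerable number of measurable
sets $\mathcal{E}_k$, then

$$ \int_{\mathcal{E}} f(P)\, G(d\mathcal{E}) = \sum_{k} \int_{\mathcal{E}_k} f(P)\, G(d\mathcal{E}). \tag{20} $$

The formula follows at once, for a finite number of terms, by
repeated application of property 9. We take the case of an infinite
number of sets $\mathcal{E}_k$. Let $|f(P)| \le L$. We can write
$\mathcal{E} = \mathcal{E}_1 + \mathcal{E}_2 + \dots + \mathcal{E}_n + R_n$, where
$R_n = \mathcal{E} - (\mathcal{E}_1 + \mathcal{E}_2 + \dots + \mathcal{E}_n)$ is
obviously a vanishing sequence of sets, so that $G(R_n) \to 0$ [87]. We
obtain by applying the property for a finite number of terms:

$$ \int_{\mathcal{E}} f(P)\, G(d\mathcal{E}) = \sum_{k=1}^{n} \int_{\mathcal{E}_k} f(P)\, G(d\mathcal{E}) + \int_{R_n} f(P)\, G(d\mathcal{E}). \tag{21} $$

We have for the last integral:

$$ \left| \int_{R_n} f(P)\, G(d\mathcal{E}) \right| \le L\, G(R_n), $$

and (21) gives (20) in the limit, since $G(R_n) \to 0$. This property is
generally known as the complete additivity of the integral.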
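11. Given any $\varepsilon > 0$, there exists an $\eta > 0$ such that

$$ \left| \int_{e} f(P)\, G(d\mathcal{E}) \right| < \varepsilon, \tag{22} $$

if $e \subset \mathcal{E}$ and $G(e) < \eta$.
This property follows at once from the inequality

$$ \left| \int_{e} f(P)\, G(d\mathcal{E}) \right| \le L\, G(e). $$

It is usually known as the absolute continuity of the integral.

12. If $\mathcal{E}$ is a set of measure zero, i.e. $G(\mathcal{E}) = 0$, we have

$$ \int_{\mathcal{E}} f(P)\, G(d\mathcal{E}) = 0 $$

for any function bounded on $\mathcal{E}$.

The function $f(P)$ is measurable on $\mathcal{E}$ [42], and the sums
$s_\delta$ and $S_\delta$ are zero for any subdivision.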

13. If $f(P)$ and $g(P)$ are equivalent on $\mathcal{E}$, then

$$ \int_{\mathcal{E}} f(P)\, G(d\mathcal{E}) = \int_{\mathcal{E}} g(P)\, G(d\mathcal{E}). \tag{23} $$

Let $A$ be the part of $\mathcal{E}$ where $f \ne g$. This set $A$ is of measure
zero, by hypothesis. The functions $f(P)$ and $g(P)$ coincide on the set
$\mathcal{E}' = \mathcal{E} - A$. We thus obtain the two equations:

$$ \int_{A} f(P)\, G(d\mathcal{E}) = \int_{A} g(P)\, G(d\mathcal{E}) = 0; \qquad \int_{\mathcal{E}'} f(P)\, G(d\mathcal{E}) = \int_{\mathcal{E}'} g(P)\, G(d\mathcal{E}), $$

addition of which gives (23).

14. If $f(P) \ge 0$ on $\mathcal{E}$ and

$$ \int_{\mathcal{E}} f(P)\, G(d\mathcal{E}) = 0, \tag{24} $$

$f(P)$ is equivalent to zero. We have to show that the measure of the
set $\mathcal{E}[f > 0]$ is zero. This set can be written as the sum of sets

$$ \mathcal{E}[f > 0] = \sum_{n=1}^{\infty} \mathcal{E}\left[f > \frac{1}{n}\right], $$

and if its measure is positive, at least one of the component sets will
be of positive measure. Let the measure, say, of the set
$B = \mathcal{E}[f > 1/n_0]$ be positive. We split the integral into two:

$$ \int_{\mathcal{E}} f(P)\, G(d\mathcal{E}) = \int_{B} f(P)\, G(d\mathcal{E}) + \int_{\mathcal{E} - B} f(P)\, G(d\mathcal{E}). \tag{25} $$

Since $f \ge 0$, the second term on the right is non-negative. We have
$f > 1/n_0$ on the set $B$, so that the first term is $\ge (1/n_0)\,G(B)$. Thus,
since $G(B) > 0$, the left-hand side of (25) is positive, which contradicts
(24).

15. If $f_n(P)$ is a sequence of functions, measurable on $\mathcal{E}$ and
uniformly bounded, i.e. $|f_n(P)| \le L$, where $L$ is a definite positive
number (independent of $n$), and this sequence tends almost everywhere
on $\mathcal{E}$ to the limit function $f(P)$, we have

$$ \lim_{n \to \infty} \int_{\mathcal{E}} f_n(P)\, G(d\mathcal{E}) = \int_{\mathcal{E}} f(P)\, G(d\mathcal{E}). \tag{26} $$

The limit function $f(P)$ satisfies the inequality $|f(P)| \le L$ almost
everywhere on $\mathcal{E}$. On passing to an equivalent function if necessary,
we can assume that this inequality is satisfied everywhere on $\mathcal{E}$. The
integral of a function $f(P)$, measurable and bounded on $\mathcal{E}$, has a
meaning.

We form the integral of $f(P) - f_n(P)$ and apply property 6:

$$ \left| \int_{\mathcal{E}} [f(P) - f_n(P)]\, G(d\mathcal{E}) \right| \le \int_{\mathcal{E}} |f(P) - f_n(P)|\, G(d\mathcal{E}). \tag{27} $$

Let $\varepsilon$ be a given positive number and $\mathcal{E}_n$ the set of all
points of $\mathcal{E}$ at which $|f(P) - f_n(P)| \ge \varepsilon$. By Theorem 6
of [44], $G(\mathcal{E}_n) \to 0$. At points of the set
$(\mathcal{E} - \mathcal{E}_n)$, we have $|f(P) - f_n(P)| < \varepsilon$. In
addition, we have at any point $P$ of $\mathcal{E}$:

$$ |f(P) - f_n(P)| \le |f(P)| + |f_n(P)| \le 2L. $$

The integral on the right-hand side of (27) is split into two:

$$ \int_{\mathcal{E}} |f(P) - f_n(P)|\, G(d\mathcal{E}) = \int_{\mathcal{E}_n} |f(P) - f_n(P)|\, G(d\mathcal{E}) + \int_{\mathcal{E} - \mathcal{E}_n} |f(P) - f_n(P)|\, G(d\mathcal{E}). $$

It follows from this, by what was said above, that

$$ \int_{\mathcal{E}} |f(P) - f_n(P)|\, G(d\mathcal{E}) \le 2L\, G(\mathcal{E}_n) + \varepsilon\, G(\mathcal{E} - \mathcal{E}_n), $$

or, all the more,

$$ \int_{\mathcal{E}} |f(P) - f_n(P)|\, G(d\mathcal{E}) \le 2L\, G(\mathcal{E}_n) + \varepsilon\, G(\mathcal{E}). $$

Since $G(\mathcal{E}_n) \to 0$, it further follows that an $N$ exists such that
$G(\mathcal{E}_n) \le \varepsilon$ for $n > N$, and therefore

$$ \int_{\mathcal{E}} |f(P) - f_n(P)|\, G(d\mathcal{E}) \le [2L + G(\mathcal{E})]\, \varepsilon \quad \text{for } n > N. $$

A comparison with (27) gives us

$$ \left| \int_{\mathcal{E}} f(P)\, G(d\mathcal{E}) - \int_{\mathcal{E}} f_n(P)\, G(d\mathcal{E}) \right| \le [2L + G(\mathcal{E})]\, \varepsilon \quad \text{for } n > N, $$

whence (26) follows, since $\varepsilon$ is arbitrary. Notice that (26) can be
proved simply on the assumption that $|f_n(P)| \le L$ almost everywhere on
$\mathcal{E}$. By passing to an equivalent function if necessary, we can assume
that this inequality is satisfied everywhere on $\mathcal{E}$. Our last property
establishes the possibility of passing to the limit under the integral
sign with the single assumption that the $f_n(P)$ are bounded in absolute
value, independently of the subscript $n$. We mentioned a similar
property for the Stieltjes integral in [11]. It is a direct consequence of the
theorem proved, since, when $f_n(P)$ and $f(P)$ are continuous, the
Lebesgue-Stieltjes integral reduces to the Stieltjes integral. Notice
that, in the statement of property 15, we can replace the convergence
of $f_n(P)$ to $f(P)$ almost everywhere by convergence in measure. The
proof remains the same in essence.
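Property 15 can be illustrated on the interval $[0, 1]$ with the ordinary Lebesgue measure. The sketch below is an illustration only: the choice $f_n(x) = x^n$ and the helper names are assumptions, and the integrals are approximated by the midpoint rule. Here $|f_n| \le 1$ and $f_n \to 0$ everywhere except at the single point $x = 1$, so the integrals must tend to zero; their exact values are $1/(n+1)$.

```python
def midpoint_integral(f, a=0.0, b=1.0, n=20000):
    """Midpoint-rule approximation to the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def power_integrals(exponents):
    """Approximate integrals over [0, 1] of f_n(x) = x**n for each exponent n."""
    return [midpoint_integral(lambda x, n=n: x ** n) for n in exponents]

# The integrals decrease towards 0, the integral of the limit function.
exponents = [1, 10, 100]
vals = power_integrals(exponents)
```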
16. If $m \le f(P) \le M$ on the set $\mathcal{E}$, the function

$$ g(y) = G(\mathcal{E}[m \le f(P) \le y]) $$

is an increasing function of $y$, and the Lebesgue-Stieltjes integral
reduces to the Stieltjes integral in accordance with the formula:

$$ \int_{\mathcal{E}} f(P)\, G(d\mathcal{E}) = \int_{m}^{M} y\, dg(y). \tag{28} $$

This can be seen simply by noticing that the Lebesgue sums (8) are
ordinary sums $s_\delta$ and $S_\delta$ for the Stieltjes integral appearing on
the right-hand side of (28).
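Formula (28) can be checked on a small model. The sketch below is not from the original text: it assumes a measure $G$ carried by finitely many weighted points, and the function names are hypothetical. It builds the function $g(y)$ and an upper Stieltjes sum for $\int y\, dg(y)$; any mass sitting exactly at $y = m$ is counted in the first sub-interval, which matches the set $\mathcal{E}_1$ of the Lebesgue subdivision.

```python
def distribution_function(values, weights):
    """Return g with g(y) = G(E[m <= f <= y]) for a finite weighted point set."""
    def g(y):
        return sum(w for v, w in zip(values, weights) if v <= y)
    return g

def stieltjes_upper_sum(values, weights, n):
    """Sum of y_k * (g(y_k) - g(y_{k-1})) over n equal parts of [m, M]."""
    m, M = min(values), max(values)
    g = distribution_function(values, weights)
    total, prev = 0.0, 0.0          # prev = 0: mass at y = m goes into the first part
    for k in range(1, n + 1):
        y_k = M if k == n else m + (M - m) * k / n
        g_k = g(y_k)
        total += y_k * (g_k - prev)
        prev = g_k
    return total

# Usage: f(x) = x^2 at 1000 equally weighted points of [0, 1].
xs = [(i + 0.5) / 1000 for i in range(1000)]
vals = [x * x for x in xs]
wts = [1.0 / 1000] * 1000
direct = sum(v * w for v, w in zip(vals, wts))
upper = stieltjes_upper_sum(vals, wts, 200)
gap_bound = (max(vals) - min(vals)) / 200 * sum(wts)
```

The Stieltjes sum exceeds the directly computed integral by no more than $\eta\, G(\mathcal{E})$, stored here as `gap_bound`.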

In the case of the Lebesgue integral, i.e. when $G(\Delta)$ is the area of an
interval, the integral is often written as follows:

$$ \iint_{\mathcal{E}} f(x, y)\, dx\, dy. $$

Similarly, the following notations are used for the Lebesgue integral
in the cases of a straight line and three-dimensional space:

$$ \int_{\mathcal{E}} f(x)\, dx \quad \text{and} \quad \iiint_{\mathcal{E}} f(x, y, z)\, dx\, dy\, dz. $$

50. The integral of a non-negative unbounded function. We now
define the integral when $f(P)$ is an unbounded and non-negative
measurable function on a measurable set $\mathcal{E}$ of finite measure. Here,
we shall split $\mathcal{E}$ into a denumerable, as well as finite, number of
measurable subsets $\mathcal{E}_k$. For the rest, the construction of the
integral will be precisely the same as for a bounded function. Suppose we have
some subdivision of $\mathcal{E}$:

$$ \mathcal{E} = \sum_{k} \mathcal{E}_k. \tag{29} $$

We form the sums $s_\delta$ and $S_\delta$ corresponding to it:

$$ s_\delta = \sum_{k} m_k\, G(\mathcal{E}_k); \qquad S_\delta = \sum_{k} M_k\, G(\mathcal{E}_k). \tag{30} $$

We have here infinite series with non-negative terms, and certain
of the numbers $m_k$ and $M_k$ may be equal to $(+\infty)$. If
$G(\mathcal{E}_k) = 0$ for some term, the corresponding term is reckoned equal
to zero even in the case when the first factor $m_k$ or $M_k$ is equal to
$(+\infty)$. The sums of series (30) do not depend on the order of the terms
[I; 134]. Notice also that, if a finite subdivision $\delta$ is taken, at least
one of the $M_k$ in the sum $S_\delta$ will be equal to $+\infty$, because
$f(P)$ is unbounded. The sums $s_\delta$ and $S_\delta$ can take infinite
values. As above, let $i$ be the strict upper bound of $s_\delta$ and $I$ the
strict lower bound of $S_\delta$. These numbers may be equal to $(+\infty)$. We
show that, as in the case of a bounded function, $i = I$. We divide the set
$\mathcal{E}$ as follows: we first extract from it the set $\mathcal{E}_0$ on
which $f(P) = +\infty$, if there is such a set, whilst we divide the
remainder of the set into sets $\mathcal{E}_k$ as follows: we subdivide the
interval $[0, +\infty)$ by $0 = y_0 < y_1 < y_2 < \dots$ and form the sets

$$ \mathcal{E}_1 = \mathcal{E}[y_0 \le f \le y_1]; \qquad \mathcal{E}_k = \mathcal{E}[y_{k-1} < f \le y_k] \quad (k = 2, 3, \dots). \tag{31} $$

We obviously have $m_k \ge y_{k-1}$, $M_k \le y_k$ and


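$$ (+\infty)\, G(\mathcal{E}_0) + \sum_{k=1}^{\infty} y_{k-1}\, G(\mathcal{E}_k) \le s_\delta \le S_\delta \le (+\infty)\, G(\mathcal{E}_0) + \sum_{k=1}^{\infty} y_k\, G(\mathcal{E}_k) \tag{32} $$

and all the more,

$$ (+\infty)\, G(\mathcal{E}_0) + \sum_{k=1}^{\infty} y_{k-1}\, G(\mathcal{E}_k) \le i \le I \le (+\infty)\, G(\mathcal{E}_0) + \sum_{k=1}^{\infty} y_k\, G(\mathcal{E}_k). \tag{33} $$

If $G(\mathcal{E}_0) > 0$, obviously $i = I = +\infty$. Now suppose that
$G(\mathcal{E}_0) = 0$. Inequalities (32) and (33) can now be rewritten as

$$ \sum_{k=1}^{\infty} y_{k-1}\, G(\mathcal{E}_k) \le s_\delta \le S_\delta \le \sum_{k=1}^{\infty} y_k\, G(\mathcal{E}_k); \tag{34} $$

$$ \sum_{k=1}^{\infty} y_{k-1}\, G(\mathcal{E}_k) \le i \le I \le \sum_{k=1}^{\infty} y_k\, G(\mathcal{E}_k). \tag{34$_1$} $$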


We shall assume that the subdivision of $[0, \infty)$ is such that the
$(y_k - y_{k-1})$ $(k = 1, 2, \dots)$ are bounded. Let $\eta$ be their strict
upper bound. On noticing that $y_k \le y_{k-1} + \eta$, we can write

$$ \sum_{k=1}^{\infty} y_k\, G(\mathcal{E}_k) \le \sum_{k=1}^{\infty} y_{k-1}\, G(\mathcal{E}_k) + \eta\, G(\mathcal{E}). \tag{35} $$

Hence it follows immediately that, if the sum on the right of (34)
is $(+\infty)$ for some subdivision with finite $\eta$, the same can be said of
the sum on the left of (34). Now, by (34$_1$), $i = I = +\infty$. Conversely,
if $I = +\infty$, by (34$_1$), the sum on the left of (35) is $(+\infty)$ for
any subdivision with finite $\eta$, so that the sum on the right of (35) is
also $(+\infty)$, and, by (34$_1$), $i = +\infty$. Hence, by (34), if
$S_\delta$ is $(+\infty)$ for some subdivision with finite $\eta$, $s_\delta$ is
also $(+\infty)$, and $s_\delta$ and $S_\delta$ are now equal to $(+\infty)$
for any subdivision with finite $\eta$. In these cases, $i = I = +\infty$.
We have in the case of finite sums, as in [49]:

$$ 0 \le \sum_{k=1}^{\infty} y_k\, G(\mathcal{E}_k) - \sum_{k=1}^{\infty} y_{k-1}\, G(\mathcal{E}_k) \le \eta\, G(\mathcal{E}), $$

and this difference tends to zero as $\eta \to 0$, whence it follows that
$i = I$. Both the Lebesgue sums (34), as also the sums $s_\delta$ and
$S_\delta$, now tend to the value of the integral as $\eta \to 0$. The same can
be said of the sums

$$ \sigma_\delta = \sum_{k=1}^{\infty} f(P_k)\, G(\mathcal{E}_k) \tag{36} $$

for any choice of $P_k$ of $\mathcal{E}_k$. In the case $i = I = +\infty$, sum
(36) is obviously also equal to $(+\infty)$. If $s_{\delta_n}$, $S_{\delta_n}$
and $\sigma_{\delta_n}$ tend to the value of the integral (the case of a finite
integral) and $\delta'_n > \delta_n$, the same can be said of the sums
$s_{\delta'_n}$, $S_{\delta'_n}$ and $\sigma_{\delta'_n}$.

The fundamental theorem of [49] can be carried over without
change to the case of unbounded non-negative and measurable
functions. A case of special importance is that when the integral is
finite. In this case $f(P)$ is said to be summable on the set $\mathcal{E}$. It
follows from what has been said that a necessary condition for a function
to be summable is that the set $\mathcal{E}_0 = \mathcal{E}[f = +\infty]$ be of
measure zero.

Another definition, equivalent to the one above, can be given of
the integral of an unbounded, non-negative and measurable function.
We shall write $[f(P)]_N$ for the bounded non-negative function defined
as follows:

$$ [f(P)]_N = \begin{cases} f(P), & \text{if } f(P) \le N, \\ N, & \text{if } f(P) > N. \end{cases} \tag{37} $$

The measurability of this function follows at once from the formula

$$ \mathcal{E}[[f]_N > a] = \mathcal{E}[f > a] \ \text{for } a < N \quad \text{and} \quad \mathcal{E}[[f]_N > a] = \Lambda \ \text{for } a \ge N, $$

where $\Lambda$ is the empty set. We form the integrals

$$ i_N = \int_{\mathcal{E}} [f]_N\, G(d\mathcal{E}). \tag{38} $$

They increase as $N$ increases, and the integral of $f(P)$ is defined as
the limit of this monotonic variable (finite or infinite) as $N \to +\infty$.
Let us show that this new definition of the integral is equivalent to the
previous one. Suppose first that the set $\mathcal{E}_0$ on which
$f(P) = +\infty$ has measure zero. The value of integral (38) is equal to the
upper bound $i_N$ of the sums $s_\delta^{(N)}$ for the function $[f]_N$. These
sums do not exceed the corresponding sums $s_\delta$ for $f$, so that
$i_N \le i$. We have to show that the monotonic variable $i_N$ has the limit
$i$ as $N \to +\infty$. We use reductio ad absurdum. Let $i_N \to i' < i$
(hence $i'$ is finite). We can take a sum $s_\delta$ for $f(P)$ such that
$s_\delta > i'$. We retain in this sum a finite number of terms in such a way
that $s'_\delta > i'$ for the finite sum $s'_\delta$ obtained:

$$ s'_\delta = {\sum_k}' m_k\, G(\mathcal{E}_k) > i', \tag{*} $$

where the sum is finite, and the summation is over the remaining
terms, as indicated by the prime on the summation sign. If $m_k = +\infty$,
$f(P) = +\infty$ at every point of $\mathcal{E}_k$, i.e.
$\mathcal{E}_k \subset \mathcal{E}_0$, so that $G(\mathcal{E}_k) = 0$, since
$G(\mathcal{E}_0) = 0$ by hypothesis. As mentioned above, the corresponding
term of the sum is taken equal to zero, and we can omit it. It can therefore
be assumed that all the $m_k$ are finite in the terms of the sum (*).
We form the corresponding sum for $[f]_N$:

$$ s_\delta'^{(N)} = {\sum_k}' m_k^{(N)}\, G(\mathcal{E}_k) \qquad (m_k^{(N)} = \inf [f]_N \text{ on } \mathcal{E}_k). $$

If the number $N$ is greater than all the $m_k$ appearing in sum (*)
(there is a finite number of these numbers), $m_k^{(N)} = m_k$ and
$s_\delta'^{(N)} = s'_\delta > i'$. All the more, the complete sum
$s_\delta^{(N)}$ for $[f]_N$ will be $> i'$, so that
$i_N = \sup s_\delta^{(N)} > i'$, which contradicts the fact that $i_N$ tends
to $i'$ whilst increasing. Hence $i_N \to i$, and, with the second definition of
the integral, its value turns out to be the same as with the first definition.

If $G(\mathcal{E}_0) > 0$, the integral is $(+\infty)$ with the first
definition. Let us show that the value is the same with the second definition.
We have, since the functions $[f]_N$ are non-negative,

$$ i_N = \int_{\mathcal{E}} [f]_N\, G(d\mathcal{E}) \ge \int_{\mathcal{E}_0} [f]_N\, G(d\mathcal{E}) = N\, G(\mathcal{E}_0), $$

because, by the definition, $[f]_N = N$ at every point of $\mathcal{E}_0$. It
follows at once from the inequality $i_N \ge N\, G(\mathcal{E}_0)$ that
$i_N \to +\infty$ as $N \to +\infty$, which is what we had to prove.
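The second definition lends itself to a direct numerical experiment. The following sketch is an illustration under stated assumptions: the set is the interval $(0, 1]$ with the ordinary Lebesgue measure, the two test functions and all helper names are chosen here and do not come from the text, and the integrals of the bounded functions $[f]_N$ are approximated by the midpoint rule. For $f(x) = 1/\sqrt{x}$ a direct calculation gives $i_N = 2 - 1/N$ (for $N \ge 1$), which tends to the finite limit 2, so $f$ is summable; for $f(x) = 1/x$ one finds $i_N = 1 + \ln N$, which increases without bound.

```python
import math

def truncated(f, N):
    """The bounded function [f]_N = min(f, N) of definition (37)."""
    return lambda x: min(f(x), N)

def midpoint_integral(f, n=100000):
    """Midpoint-rule approximation to the integral of f over (0, 1]."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) for i in range(n))

def i_N(f, N):
    """Approximate value of the integral (38) over (0, 1]."""
    return midpoint_integral(truncated(f, N))

def inv_sqrt(x):
    return 1.0 / math.sqrt(x)

def inv_x(x):
    return 1.0 / x

summable_values = [i_N(inv_sqrt, N) for N in (1, 4, 16)]   # close to 2 - 1/N
divergent_value = i_N(inv_x, 16)                            # close to 1 + ln 16
```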


51. Properties of the integral. By using the sums $\sigma_\delta$, we can prove
certain properties of the integral of an unbounded non-negative
function precisely as we did in [49]. We can also make use of the
second definition of the integral in proving the properties. A further
point: if $f(P)$ is a bounded non-negative function, $[f(P)]_N$ is the same
as $f(P)$ for sufficiently large $N$, and the new definition of the integral is
the same as the previous one (of [48]).

We turn to the proof of the properties of the integral. As in [49]
we assume that $\mathcal{E}$ is a measurable set of finite measure.

1. If $f_k(P)$ $(k = 1, 2, \dots, m)$ are summable functions, a linear
combination of them with constant coefficients is a summable function,
and (13) holds.

The proof is the same as in [49].

2. If $f(P)$ is summable on $\mathcal{E}$, it is summable on any measurable part
$\mathcal{E}'$ of $\mathcal{E}$.

We have for $[f(P)]_N$, by property 9 of [49] and the fact that it is
non-negative:

$$ \int_{\mathcal{E}'} [f(P)]_N\, G(d\mathcal{E}) \le \int_{\mathcal{E}} [f(P)]_N\, G(d\mathcal{E}). $$

On passing to the limit, we get

$$ \int_{\mathcal{E}'} f(P)\, G(d\mathcal{E}) \le \int_{\mathcal{E}} f(P)\, G(d\mathcal{E}), \tag{39} $$

and if the right-hand side is finite, the left-hand side is all the more
finite.

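3. If $f(P)$ is summable on $\mathcal{E}$ and the set $\mathcal{E}$ is divided
into a finite or denumerable number of measurable sets, (20) holds.

We take the case of an infinite number of sets $\mathcal{E}_k$. We have for the
bounded function $[f]_N$:

$$ \int_{\mathcal{E}} [f]_N\, G(d\mathcal{E}) = \sum_{k=1}^{\infty} \int_{\mathcal{E}_k} [f]_N\, G(d\mathcal{E}), \tag{40} $$

whence it follows that

$$ \int_{\mathcal{E}} [f]_N\, G(d\mathcal{E}) \le \sum_{k=1}^{\infty} \int_{\mathcal{E}_k} f(P)\, G(d\mathcal{E}). \tag{41} $$

We obtain on indefinite increase of $N$:

$$ \int_{\mathcal{E}} f(P)\, G(d\mathcal{E}) \le \sum_{k=1}^{\infty} \int_{\mathcal{E}_k} f(P)\, G(d\mathcal{E}). \tag{42} $$

Let us prove the reverse inequality. Since $f(P)$ is non-negative, we
can write, by (40), for any given finite $m$:

$$ \int_{\mathcal{E}} [f]_N\, G(d\mathcal{E}) \ge \sum_{k=1}^{m} \int_{\mathcal{E}_k} [f]_N\, G(d\mathcal{E}). $$

We obtain in the limit, on indefinite increase of $N$:

$$ \int_{\mathcal{E}} f(P)\, G(d\mathcal{E}) \ge \sum_{k=1}^{m} \int_{\mathcal{E}_k} f(P)\, G(d\mathcal{E}). $$

On now increasing $m$ indefinitely, we arrive at the inequality

$$ \int_{\mathcal{E}} f(P)\, G(d\mathcal{E}) \ge \sum_{k=1}^{\infty} \int_{\mathcal{E}_k} f(P)\, G(d\mathcal{E}), $$

which is the reverse of (42).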
4. If $\mathcal{E}$ is divided into a finite or denumerable number of measurable
sets $\mathcal{E}_k$, the function $f(P)$ is summable on each $\mathcal{E}_k$
and the series

$$ \sum_{k=1}^{\infty} \int_{\mathcal{E}_k} f(P)\, G(d\mathcal{E}) \tag{43} $$

with non-negative terms has a finite sum (is convergent), $f(P)$ is
summable on $\mathcal{E}$, and (20) holds.

We have (40) as above for the function $[f]_N$, and also inequality (41),
the right-hand side of which is a finite number. It follows at once from
this inequality that integrals (38) have a finite limit, i.e. $f(P)$ is
summable on $\mathcal{E}$. After this, (20) follows from the summability.

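5. If $f(P)$ is summable on $\mathcal{E}$, given any $\varepsilon > 0$ there
exists an $\eta > 0$ such that

$$ \int_{e} f(P)\, G(d\mathcal{E}) < \varepsilon \tag{44} $$

when $e \subset \mathcal{E}$ and $G(e) < \eta$.
We can fix an $N$ such that

$$ \int_{\mathcal{E}} (f - [f]_N)\, G(d\mathcal{E}) < \frac{\varepsilon}{2} \qquad (f \ge [f]_N). $$

Now, by (39), we have for any $e \subset \mathcal{E}$:

$$ \int_{e} (f - [f]_N)\, G(d\mathcal{E}) < \frac{\varepsilon}{2}, \quad \text{i.e.} \quad \int_{e} f\, G(d\mathcal{E}) < \int_{e} [f]_N\, G(d\mathcal{E}) + \frac{\varepsilon}{2}, $$

and, since $[f]_N \le N$ so that the integral of $[f]_N$ over $e$ does not
exceed $N G(e)$, we obtain the required inequality when
$G(e) < \varepsilon/2N$.

The last two properties show that the integral of an unbounded
non-negative function is completely additive and absolutely continuous,
like the integral of a bounded function.

6. If $\mathcal{E}$ is a set of measure zero, the integral of $f(P)$ is zero.
The proof is the same as in [49].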

7. Integrals of functions equivalent on $\mathcal{E}$ are equal.

8. If the integral of $f(P)$ vanishes, the function is equivalent to zero.

9. If $f_1(P) \le f_2(P)$ on $\mathcal{E}$ and $f_2(P)$ is summable, $f_1(P)$ is
also summable and we have

$$ \int_{\mathcal{E}} f_1(P)\, G(d\mathcal{E}) \le \int_{\mathcal{E}} f_2(P)\, G(d\mathcal{E}). \tag{45} $$

10. If $\omega_n(P)$ is a sequence of non-negative functions summable on
$\mathcal{E}$ and

$$ \int_{\mathcal{E}} \omega_n(P)\, G(d\mathcal{E}) \to 0 \quad \text{as } n \to \infty, $$

then $\omega_n(P) \to 0$ in measure on $\mathcal{E}$.

The proof of properties 7, 8 and 9 is the same as in [49]. Inequality
(45) can be written for $[f_1]_N$ and $[f_2]_N$, and we get (45) on passing to
the limit as $N \to \infty$. Property 10 is proved by reductio ad absurdum.
If $\omega_n(P)$ does not tend to 0 in measure, there exists a $\delta > 0$ such
that $G(\mathcal{E}_n)$ does not tend to 0, where
$\mathcal{E}_n = \mathcal{E}[\omega_n(P) \ge \delta]$. Hence it follows that
there exists a subsequence $\mathcal{E}_{n_k}$ such that
$G(\mathcal{E}_{n_k}) \ge d$, where $d$ is a positive number. We have

$$ \int_{\mathcal{E}} \omega_{n_k}(P)\, G(d\mathcal{E}) \ge \int_{\mathcal{E}_{n_k}} \omega_{n_k}(P)\, G(d\mathcal{E}) \ge \delta\, G(\mathcal{E}_{n_k}) \ge \delta d, $$

whence it follows that the integral on the left does not tend to 0, and this
contradicts the hypothesis.
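The inequality used in this proof, namely that $\delta\, G(\mathcal{E}[\omega \ge \delta])$ does not exceed the integral of a non-negative $\omega$, can be checked on a small model. The sketch below is illustrative only: it assumes a measure carried by equally weighted points of $[0, 1]$ and takes $\omega_n(x) = x^n$, and the function names are hypothetical.

```python
def integral(values, weights):
    """Integral with respect to a measure carried by finitely many points."""
    return sum(v * w for v, w in zip(values, weights))

def measure_at_least(values, weights, delta):
    """Measure of the set where the function is >= delta."""
    return sum(w for v, w in zip(values, weights) if v >= delta)

def measure_and_integral(n, delta=0.5, points=1000):
    """Return (G(E[omega_n >= delta]), integral of omega_n) for omega_n(x) = x**n."""
    xs = [(i + 0.5) / points for i in range(points)]
    weights = [1.0 / points] * points
    omega = [x ** n for x in xs]
    return measure_at_least(omega, weights, delta), integral(omega, weights)

results = {n: measure_and_integral(n) for n in (1, 5, 50)}
```

As $n$ grows the integrals decrease, and the measure of the set where $\omega_n \ge 1/2$ is forced down with them.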

We now turn to the definition of the Lebesgue-Stieltjes integral
for an unbounded function that can change sign. The integral can
be defined as above for negative (non-positive) functions.


52. Functions of any sign. Let $f(P)$ be a real measurable function
given on a measurable set $\mathcal{E}$ of finite measure, and taking values of
either sign. We introduce the so-called positive and negative parts
of $f(P)$:

$$ f^+(P) = \begin{cases} f(P), & \text{if } f(P) \ge 0; \\ 0, & \text{if } f(P) < 0; \end{cases} \qquad f^-(P) = \begin{cases} -f(P), & \text{if } f(P) \le 0; \\ 0, & \text{if } f(P) > 0. \end{cases} \tag{46} $$

This definition can be written alternatively as

$$ f^+(P) = \tfrac{1}{2}\,[\,|f(P)| + f(P)\,]; \qquad f^-(P) = \tfrac{1}{2}\,[\,|f(P)| - f(P)\,]. \tag{46$_1$} $$

We can now write $f(P)$ as the difference between two non-negative
functions:

$$ f(P) = f^+(P) - f^-(P). $$

DEFINITION. A function $f(P)$ is said to be summable on $\mathcal{E}$ if
$f^+(P)$ and $f^-(P)$ are summable on $\mathcal{E}$. The integral of $f(P)$ is
now given by

$$ \int_{\mathcal{E}} f(P)\, G(d\mathcal{E}) = \int_{\mathcal{E}} f^+(P)\, G(d\mathcal{E}) - \int_{\mathcal{E}} f^-(P)\, G(d\mathcal{E}). \tag{47} $$

Notice that, if only one of the functions $f^+(P)$ or $f^-(P)$ is summable,
the last formula gives a definite though infinite value for the integral
of $f(P)$. For instance, if $f^+(P)$ is summable, whilst $f^-(P)$ is not, the
integral of $f(P)$ is equal to $(-\infty)$.
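The decomposition into positive and negative parts is easy to verify on a toy case. The sketch below assumes a measure carried by four points of weight $1/4$ each; the function names and the sample values are illustrative and are not taken from the text.

```python
def positive_part(v):
    """f^+ = (|f| + f) / 2."""
    return (abs(v) + v) / 2

def negative_part(v):
    """f^- = (|f| - f) / 2."""
    return (abs(v) - v) / 2

def integral(values, weights):
    """Integral with respect to a measure carried by finitely many points."""
    return sum(v * w for v, w in zip(values, weights))

def signed_integral(values, weights):
    """Integral of f formed as the integral of f^+ minus the integral of f^-."""
    plus = integral([positive_part(v) for v in values], weights)
    minus = integral([negative_part(v) for v in values], weights)
    return plus - minus

values = [-2.0, 0.5, 3.0, -1.0]
weights = [0.25, 0.25, 0.25, 0.25]
result = signed_integral(values, weights)   # 0.875 - 0.75 = 0.125
```

The value agrees with the direct weighted sum of the values of $f$, and $f^+ + f^- = |f|$ at every point.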

THEOREM. The necessary and sufficient condition for $f(P)$ to be
summable on $\mathcal{E}$ is that the non-negative function $|f(P)|$ be
summable on $\mathcal{E}$.

If $f(P)$ is summable, $f^+(P)$ and $f^-(P)$ are summable, so that their
sum $|f(P)| = f^+(P) + f^-(P)$ is also summable. Conversely, if the
sum $f^+(P) + f^-(P)$ is summable, each term is summable by property
9 of [51], so that $f(P)$ is also summable. Notice that the division
into positive and negative parts can also be performed for a bounded
function, and (47) holds for the integral. In future, we shall often
use the term "summable function" for a bounded measurable function.
We now turn to the basic properties of the integral of a summable
function of any sign. These properties follow almost immediately
from the analogous properties of the integrals of the non-negative
functions $f^+(P)$ and $f^-(P)$.

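1. If $f_k(P)$ $(k = 1, 2, \dots, p)$ are summable functions, a linear
combination of them with constant coefficients is a summable function
and (13) holds.

The summability of a linear combination follows at once from the
inequality

$$ \left| \sum_{k=1}^{p} c_k f_k(P) \right| \le \sum_{k=1}^{p} |c_k|\, |f_k(P)|, $$

the theorem proved above, and property 1 of [51]. To prove (13),
we take separately the case of multiplying a function by a constant
and the case of addition of two functions. Let $f(P)$ be summable and $c$
be a constant. We have to show that

$$ \int_{\mathcal{E}} c f(P)\, G(d\mathcal{E}) = c \int_{\mathcal{E}} f(P)\, G(d\mathcal{E}). $$

We assume $c$ negative for definiteness. We now have $(cf)^+ = -c f^-$ and
$(cf)^- = -c f^+$. Definition (47) gives us

$$ \int_{\mathcal{E}} c f\, G(d\mathcal{E}) = -c \int_{\mathcal{E}} f^-\, G(d\mathcal{E}) + c \int_{\mathcal{E}} f^+\, G(d\mathcal{E}) = c \int_{\mathcal{E}} f\, G(d\mathcal{E}), $$

and the formula is proved. We now suppose that $f_1(P)$ and $f_2(P)$ are
summable functions. We have to show that

$$ \int_{\mathcal{E}} (f_1 + f_2)\, G(d\mathcal{E}) = \int_{\mathcal{E}} f_1\, G(d\mathcal{E}) + \int_{\mathcal{E}} f_2\, G(d\mathcal{E}). \tag{48} $$

We split the functions $f_1$, $f_2$ and $f = f_1 + f_2$ into positive and
negative parts:

$$ f_1 = f_1^+ - f_1^-; \qquad f_2 = f_2^+ - f_2^-; \qquad f = f^+ - f^-. $$

We have

$$ f_1^+ + f_2^+ + f^- = f_1^- + f_2^- + f^+. $$

All the functions written are non-negative and summable. On applying
property 1 of [51], we get

$$ \int_{\mathcal{E}} f_1^+\, G(d\mathcal{E}) + \int_{\mathcal{E}} f_2^+\, G(d\mathcal{E}) + \int_{\mathcal{E}} f^-\, G(d\mathcal{E}) = \int_{\mathcal{E}} f_1^-\, G(d\mathcal{E}) + \int_{\mathcal{E}} f_2^-\, G(d\mathcal{E}) + \int_{\mathcal{E}} f^+\, G(d\mathcal{E}), $$

whence

$$ \int_{\mathcal{E}} f^+\, G(d\mathcal{E}) - \int_{\mathcal{E}} f^-\, G(d\mathcal{E}) = \int_{\mathcal{E}} f_1^+\, G(d\mathcal{E}) - \int_{\mathcal{E}} f_1^-\, G(d\mathcal{E}) + \int_{\mathcal{E}} f_2^+\, G(d\mathcal{E}) - \int_{\mathcal{E}} f_2^-\, G(d\mathcal{E}), $$

which proves (48).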

2. If $f_1(P)$ and $f_2(P)$ are summable on $\mathcal{E}$ and
$f_1(P) \ge f_2(P)$, (15) holds.

By property 1, the non-negative function $f_1(P) - f_2(P)$ is summable
and the integral of it (it is non-negative) is equal to the difference
between the integrals of $f_1(P)$ and $f_2(P)$, which proves (15).

3. If $f(P)$ is summable, (16) holds.

Inequality (16) is equivalent to the following obvious inequality:

$$ \left| \int_{\mathcal{E}} [f^+(P) - f^-(P)]\, G(d\mathcal{E}) \right| \le \int_{\mathcal{E}} [f^+(P) + f^-(P)]\, G(d\mathcal{E}). $$

4. If $f(P)$ is summable on $\mathcal{E}$, it is summable on any measurable
part $\mathcal{E}'$ of $\mathcal{E}$.

5. If $f(P)$ is summable on $\mathcal{E}$, and $\mathcal{E}$ is divided into a
finite or denumerable number of measurable sets $\mathcal{E}_k$, (20) holds.

These last two properties follow from the fact that they hold for
$f^+(P)$ and $f^-(P)$.

6. If $\mathcal{E}$ is divided into a finite or denumerable number of measurable
sets $\mathcal{E}_k$, $f(P)$ is summable on each $\mathcal{E}_k$ and the series

$$ \sum_{k=1}^{\infty} \int_{\mathcal{E}_k} |f(P)|\, G(d\mathcal{E}) \tag{49} $$

is convergent, then $f(P)$ is summable on $\mathcal{E}$ and (20) holds.

The non-negative $|f(P)|$ is summable on all the $\mathcal{E}_k$, and it
follows from the convergence of series (49), by property 4 of [51], that
$|f(P)|$ is summable on $\mathcal{E}$, so that $f(P)$ is also summable on
$\mathcal{E}$. After this, (20) follows from the last property. Notice that the
convergence of series (43) is not sufficient for us to assert that $f(P)$ is
summable.

7. If $f(P)$ is summable on $\mathcal{E}$, given any $\varepsilon > 0$ there
exists an $\eta > 0$ such that

$$ \left| \int_{e} f(P)\, G(d\mathcal{E}) \right| < \varepsilon $$

when $e \subset \mathcal{E}$ and $G(e) < \eta$.

This property follows at once from the fact that it holds for $f^+(P)$
and $f^-(P)$.

We have thus proved that the integral of any summable function
is completely additive and absolutely continuous.

8. If $\mathcal{E}$ is a set of measure zero, the integral of any $f(P)$ over
$\mathcal{E}$ is zero.

9. The integrals of functions equivalent on $\mathcal{E}$ are equal.

Both properties are the result of the analogous properties for $f^+(P)$
and $f^-(P)$; we have to notice here that, if two functions are equivalent,
their positive and negative parts are equivalent.

10. If $f(P)$ is measurable on $\mathcal{E}$, $F(P)$ is measurable, non-negative
and summable on $\mathcal{E}$ and $|f(P)| \le F(P)$, then $f(P)$ is summable,
and we have

$$ \left| \int_{\mathcal{E}} f(P)\, G(d\mathcal{E}) \right| \le \int_{\mathcal{E}} F(P)\, G(d\mathcal{E}). \tag{50} $$

We can say, by property 9 of [51], that $|f(P)|$ is summable, so
that $f(P)$ is also summable. Inequality (50) follows at once from
property 3 and property 9 of [51]. An immediate consequence of
what has been proved is that the product of a summable function
and a bounded measurable function is summable.

We must mention two further properties of the integral, which will
be useful later.

11. If $f(P)$ is summable on a finite interval $\Delta_0$, and the integral
of it over any interval $\Delta$ belonging to $\Delta_0$ is zero, $f(P)$ is
equivalent to zero on $\Delta_0$.

We use reductio ad absurdum. If $f(P)$ is not equivalent to zero,
there exists a positive $a$ such that one of the sets $\Delta_0[f(P) \ge a]$ or
$\Delta_0[f(P) \le -a]$ has measure greater than zero. Let this be the
first set, and let us write it as $\mathcal{E}$. We have

$$ \int_{\mathcal{E}} f(P)\, G(d\mathcal{E}) \ge a\, G(\mathcal{E}) > 0. $$

But there exist sets $e_1$ and $e_2$ of as small measure as desired such
that $\mathcal{E} + e_1 = R + e_2$, where $R$ is an elementary figure, i.e. a
finite sum of intervals with no common points. By hypothesis, the integral
of $f(P)$ over $R$ must be zero, and we can write

$$ \int_{\mathcal{E}} f(P)\, G(d\mathcal{E}) = \int_{e_2} f(P)\, G(d\mathcal{E}) - \int_{e_1} f(P)\, G(d\mathcal{E}). $$

Since the integral is absolutely continuous, the right-hand side
can be made as small as desired in absolute value, whilst the left
keeps a definite positive value. We have arrived at an absurdity,
and our assertion is proved.

12. If $f(P)$ is summable on $\mathcal{E}$ and satisfies the condition

$$ \int_{\mathcal{E}} \varphi(P)\, f(P)\, G(d\mathcal{E}) = 0 \tag{51} $$

for any choice of $\varphi(P)$, measurable and bounded on $\mathcal{E}$, $f(P)$
is equivalent to zero. If (51) is satisfied for any choice of $\varphi(P)$,
measurable and bounded on $\mathcal{E}$ and such that

$$ \int_{\mathcal{E}} \varphi(P)\, G(d\mathcal{E}) = 0, \tag{52} $$

$f(P)$ is equivalent to a constant.

Let $\mathcal{E}'$ be the part of $\mathcal{E}$ where $f(P) \ge 0$. We take as
$\varphi(P)$ a function equal to unity on $\mathcal{E}'$ and zero on
$\mathcal{E} - \mathcal{E}'$. Condition (51) shows that the integral of
$f^+(P)$ over $\mathcal{E}$ is zero, so that, by property 8 of [51], $f^+(P)$
must be equivalent to zero. Similarly, we can show that $f^-(P)$ is equivalent
to zero, so that $f(P)$ is equivalent to zero.

We turn to the proof of the second part of the assertion. On writing
$kG(\mathcal{E})$ for the integral of $f(P)$ over $\mathcal{E}$, by (52),
$f(P) - k$ also satisfies condition (51), i.e.

$$ \int_{\mathcal{E}} \varphi(P)\, [f(P) - k]\, G(d\mathcal{E}) = 0. $$

Moreover, by the definition of $k$, we have for any choice of constant $c$:

$$ \int_{\mathcal{E}} c\, [f(P) - k]\, G(d\mathcal{E}) = 0. \tag{53} $$

Let $\psi(P)$ be any measurable function bounded on $\mathcal{E}$, and
$cG(\mathcal{E})$ the value of its integral. The function
$\varphi(P) = \psi(P) - c$ satisfies condition (52), and we have

$$ \int_{\mathcal{E}} [\psi(P) - c]\, [f(P) - k]\, G(d\mathcal{E}) = 0 $$

or, by (53):

$$ \int_{\mathcal{E}} \psi(P)\, [f(P) - k]\, G(d\mathcal{E}) = 0, $$

whence it follows, by what has been proved above, that $f(P) - k$
is equivalent to zero, i.e. $f(P)$ is equivalent to $k$.

We conclude this section by considering the Lebesgue-Stieltjes
integral for a single variable. Let $g(x)$ be a non-decreasing function,
which is at the basis of the measurement, and $f(x)$ be measurable
with respect to $g(x)$ and summable on the measurable set $\mathcal{E}$ of the
$X$ axis or on the interval $[a, b]$ or on $(a, b]$, etc.:

$$ \int_{\mathcal{E}} f(x)\, dg(x) \quad \text{or} \quad \int_{[a, b]} f(x)\, dg(x), \quad \text{or} \quad \int_{(a, b]} f(x)\, dg(x) \quad \text{etc.} $$

If $g(x) = x$, we get the Lebesgue integral. In this case the measure
of any point is zero, and it is of no importance whether or not the
ends are associated with an interval; the integral over an interval is
here usually written as

$$ \int_{a}^{b} f(x)\, dx. $$

We are assuming $a < b$. In addition, the following convention is
adopted:

$$ \int_{b}^{a} f(x)\, dx = -\int_{a}^{b} f(x)\, dx. $$


53. Complex summable functions. It is easy to define a summable
function, and an integral, for a function $f(P)$ that takes complex
values. We split the function into real and imaginary parts:

$$ f(P) = f_1(P) + i f_2(P). $$

We say that $f(P)$ is summable if $f_1(P)$ and $f_2(P)$ are summable,
and the integral of $f(P)$ is defined as

$$ \int_{\mathcal{E}} f(P)\, G(d\mathcal{E}) = \int_{\mathcal{E}} f_1(P)\, G(d\mathcal{E}) + i \int_{\mathcal{E}} f_2(P)\, G(d\mathcal{E}). \tag{54} $$

The theorem proved above holds in this case: the necessary and
sufficient condition for $f(P)$ to be summable is that the modulus
$|f(P)|$ be summable.

We notice first of all that, since $f_1(P)$ and $f_2(P)$ are measurable,
the sum of their squares is measurable, so that the arithmetic value
of the square root of this sum is measurable, i.e.
$|f| = \sqrt{f_1^2 + f_2^2}$ is measurable, as follows at once from the formula

$$ \mathcal{E}\left[\sqrt{f_1^2 + f_2^2} > a\right] = \mathcal{E}\left[f_1^2 + f_2^2 > a^2\right]. $$

Further, it follows from the inequalities

$$ |f_1| \le \sqrt{f_1^2 + f_2^2}; \qquad |f_2| \le \sqrt{f_1^2 + f_2^2}; \qquad \sqrt{f_1^2 + f_2^2} \le |f_1| + |f_2| $$

and properties 9 and 1 of [51] that the summability of $|f_1|$ and $|f_2|$
is equivalent to the summability of $|f|$, whence follows the assertion
that we made above.

Further, properties 1, 3, 4, 5, 6, 7, 8, 9 and 10 above still hold, and
complex constant coefficients $c_k$ can be used when forming a linear
combination of the $f_k(P)$. We shall only dwell on the proof of
property 3:

$$ \left| \int_{\mathcal{E}} f(P)\, G(d\mathcal{E}) \right| \le \int_{\mathcal{E}} |f(P)|\, G(d\mathcal{E}). \tag{55} $$

The functions $f_1$, $f_2$ and $\sqrt{f_1^2 + f_2^2}$ are summable, so that the
sums $\sigma_{\delta_n^{(1)}}$, $\sigma_{\delta_n^{(2)}}$,
$\sigma_{\delta_n^{(3)}}$, corresponding to the sequences of Lebesgue
subdivisions of these functions, tend to the respective integrals. If we
take the sequence of subdivisions
$\delta_n = \delta_n^{(1)} \delta_n^{(2)} \delta_n^{(3)}$, the sums
$\sigma_{\delta_n}$ for $f_1$, $f_2$ and $\sqrt{f_1^2 + f_2^2}$ will all the more
tend to the respective integrals. If $\delta_n$ is a subdivision of
$\mathcal{E}$ into $\mathcal{E}_k^{(n)}$, and $P_k^{(n)}$ are any points of the
$\mathcal{E}_k^{(n)}$, we have

$$ \left| \sum_{k} \left[ f_1(P_k^{(n)}) + i f_2(P_k^{(n)}) \right] G(\mathcal{E}_k^{(n)}) \right| \le \sum_{k} \left| f_1(P_k^{(n)}) + i f_2(P_k^{(n)}) \right| G(\mathcal{E}_k^{(n)}), $$

and (55) is obtained in the limit.
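Definition (54) and inequality (55) can be checked with Python's built-in complex numbers. The sketch below is an illustration only: it assumes a measure carried by three weighted points, and the sample values and function name are chosen here.

```python
def integral(values, weights):
    """Integral with respect to a measure carried by finitely many points.
    Works for real or complex values."""
    return sum(v * w for v, w in zip(values, weights))

values = [1 + 1j, -1 + 2j, 3 - 4j]      # f = f_1 + i f_2 at three points
weights = [0.5, 0.25, 0.25]

whole = integral(values, weights)
real_part = integral([v.real for v in values], weights)
imag_part = integral([v.imag for v in values], weights)
modulus_integral = integral([abs(v) for v in values], weights)
```

Here `whole` coincides with `real_part + 1j * imag_part`, as in (54), and `abs(whole)` does not exceed `modulus_integral`, as in (55).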

54. Passage to the limit under the integral sign. Let us prove some
theorems on passage to the limit under the integral sign for summable
functions.

THEOREM 1. If $f_n(P)$ is a sequence of functions, summable on a set $\mathcal{E}$
of finite measure, where

$$|f_n(P)| \leq F(P) \tag{56}$$

for all these functions, $F(P)$ is summable on $\mathcal{E}$, and $f_n(P) \to f(P)$ almost
everywhere on $\mathcal{E}$, then $f(P)$ is summable on $\mathcal{E}$, and

$$\lim_{n \to \infty} \int_{\mathcal{E}} f_n(P)\,G(d\mathcal{E}) = \int_{\mathcal{E}} f(P)\,G(d\mathcal{E}). \tag{57}$$


It follows from the hypotheses that the limit function satisfies
almost everywhere on $\mathcal{E}$ the inequality

$$|f(P)| \leq F(P). \tag{56$_1$}$$

By passing to the equivalent function, we can assume that this
inequality is satisfied everywhere on $\mathcal{E}$. By property 10 of [52],
$f_n(P)$ and $f(P)$ are summable on $\mathcal{E}$ and hence have finite values
almost everywhere on $\mathcal{E}$. We take the integral of $f(P) - f_n(P)$ and
apply property 10 of [51]:

$$\left|\int_{\mathcal{E}} f(P)\,G(d\mathcal{E}) - \int_{\mathcal{E}} f_n(P)\,G(d\mathcal{E})\right| \leq \int_{\mathcal{E}} |f(P) - f_n(P)|\,G(d\mathcal{E}). \tag{58}$$
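Let $\varepsilon$ be a given positive number, and $\mathcal{E}_n$ the sets belonging to $\mathcal{E}$
about which we spoke in Theorem 6 of [44]. We have, by this theorem,

$$G(\mathcal{E}_n) \to 0 \quad \text{and} \quad |f(P) - f_n(P)| \leq \varepsilon \quad \text{if } P \in \mathcal{E} - \mathcal{E}_n. \tag{59}$$

In addition, we have at any point $P$ of $\mathcal{E}$:

$$|f(P) - f_n(P)| \leq |f(P)| + |f_n(P)| \leq 2F(P). \tag{60}$$

We split the integral on the right-hand side of (58) into two:

$$\int_{\mathcal{E}} |f(P) - f_n(P)|\,G(d\mathcal{E}) = \int_{\mathcal{E}_n} |f(P) - f_n(P)|\,G(d\mathcal{E}) + \int_{\mathcal{E} - \mathcal{E}_n} |f(P) - f_n(P)|\,G(d\mathcal{E}).$$

Hence, by (59) and (60):

$$\int_{\mathcal{E}} |f(P) - f_n(P)|\,G(d\mathcal{E}) \leq 2 \int_{\mathcal{E}_n} F(P)\,G(d\mathcal{E}) + \varepsilon\,G(\mathcal{E} - \mathcal{E}_n),$$

or all the more

$$\int_{\mathcal{E}} |f(P) - f_n(P)|\,G(d\mathcal{E}) \leq 2 \int_{\mathcal{E}_n} F(P)\,G(d\mathcal{E}) + \varepsilon\,G(\mathcal{E}). \tag{61}$$

Since $G(\mathcal{E}_n) \to 0$ and the integral of $F(P)$ is absolutely continuous,
it follows that there exists an $N$ such that

$$\int_{\mathcal{E}_n} F(P)\,G(d\mathcal{E}) \leq \varepsilon \quad \text{for } n > N,$$

and, by (61):

$$\int_{\mathcal{E}} |f(P) - f_n(P)|\,G(d\mathcal{E}) \leq [2 + G(\mathcal{E})]\,\varepsilon \quad \text{for } n > N.$$

Comparison with (58) gives us

$$\left|\int_{\mathcal{E}} f(P)\,G(d\mathcal{E}) - \int_{\mathcal{E}} f_n(P)\,G(d\mathcal{E})\right| \leq [2 + G(\mathcal{E})]\,\varepsilon,$$

whence (57) follows, since $\varepsilon$ is arbitrary. As when proving property
15 of [49], it is sufficient to assume that inequality (56) is satisfied
almost everywhere on $\mathcal{E}$.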

Note. It may easily be seen that we have only used the convergence
in measure $f_n(P) \to f(P)$, so that convergence almost
everywhere can be replaced in the statement of the theorem by
convergence in measure.
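
The role of the majorant $F(P)$ in Theorem 1 can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it takes $\mathcal{E} = [0, 1]$ with Lebesgue measure, replaces the integral by a midpoint sum, and uses two hypothetical sequences, `power` (dominated by $F = 1$) and `spike` (which tends to zero everywhere but has no summable majorant).

```python
def midpoint_integral(func, a=0.0, b=1.0, cells=20000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / cells
    return h * sum(func(a + (j + 0.5) * h) for j in range(cells))


def power(n):
    # f_n(x) = x**n: |f_n| <= F = 1 on [0, 1], and f_n -> 0 almost everywhere.
    return lambda x: x ** n


def spike(n):
    # g_n(x) = n on (0, 1/n) and 0 elsewhere: g_n -> 0 at every point,
    # but no summable F dominates every g_n.
    return lambda x: float(n) if 0.0 < x < 1.0 / n else 0.0


# Integrals of the dominated sequence shrink like 1/(n + 1), as (57) predicts.
power_integrals = [midpoint_integral(power(n)) for n in (1, 10, 100)]

# Integrals of the undominated sequence stay near 1 although the limit is 0.
spike_integrals = [midpoint_integral(spike(n)) for n in (1, 10, 100)]
```

For the second sequence the limit of the integrals differs from the integral of the limit, which is consistent with the theorem because hypothesis (56) fails.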

THEOREM 2. If $f_n(P)$ is a non-decreasing sequence of functions,
summable on a set $\mathcal{E}$ of finite measure, the integral over $\mathcal{E}$ of the limit
function $f(P)$ is finite or $(+\infty)$, and (57) holds.

The summable functions $f_n(P)$ are finite almost everywhere on $\mathcal{E}$,
and the non-decreasing sequence $f_n(P)$ has a limit at every point,
which may be equal to $(+\infty)$. Let us consider the non-decreasing
sequence of functions $f_n(P) - f_1(P)$. We obviously have

$$0 \leq f_n(P) - f_1(P) \leq f(P) - f_1(P).$$

If the non-negative function $f(P) - f_1(P)$ is summable over $\mathcal{E}$,
$f(P)$ is also summable. The difference $f(P) - f_1(P)$ can play the role
of $F(P)$ in Theorem 1, and application of this theorem gives

$$\lim_{n \to \infty} \int_{\mathcal{E}} [f_n(P) - f_1(P)]\,G(d\mathcal{E}) = \int_{\mathcal{E}} [f(P) - f_1(P)]\,G(d\mathcal{E}).$$

Addition of the integral of $f_1(P)$ to both sides gives (57). Now let
the integral of $f(P) - f_1(P)$ be $(+\infty)$. Then, since $f_1(P)$ is summable,
the integral of $f(P)$ is also $(+\infty)$. Notice further that, if the sequence
$\varphi_n(P)$ tends to $\varphi(P)$ almost everywhere, $[\varphi_n(P)]_N \to [\varphi(P)]_N$ almost
everywhere for any $N$.

This can be proved simply by remarking that, if $\varphi_n(P) \to \varphi(P)$ at
some point, $[\varphi_n(P)]_N \to [\varphi(P)]_N$ at this point. This is easily seen
if we consider separately the cases $\varphi(P) \leq N$ and $\varphi(P) > N$. Thus
the non-decreasing sequence of non-negative functions
$[f_n(P) - f_1(P)]_N$ tends almost everywhere to $[f(P) - f_1(P)]_N$. The limit
function is bounded and all the more summable. By what has been
proved:

$$\lim_{n \to \infty} \int_{\mathcal{E}} [f_n(P) - f_1(P)]_N\,G(d\mathcal{E}) = \int_{\mathcal{E}} [f(P) - f_1(P)]_N\,G(d\mathcal{E}). \tag{62}$$

Let $K$ be any given positive number. Since the integral of
$f(P) - f_1(P)$ is equal to $(+\infty)$, we can fix an $N$ such that the integral
on the right-hand side of (62) is greater than $K$. Hence we have,
by (62), for all sufficiently large $n$:

$$\int_{\mathcal{E}} [f_n(P) - f_1(P)]_N\,G(d\mathcal{E}) > K,$$

and all the more:

$$\int_{\mathcal{E}} [f_n(P) - f_1(P)]\,G(d\mathcal{E}) > K.$$

Since $K$ is arbitrary, it follows that

$$\lim_{n \to \infty} \left[\int_{\mathcal{E}} f_n(P)\,G(d\mathcal{E}) - \int_{\mathcal{E}} f_1(P)\,G(d\mathcal{E})\right] = +\infty,$$

i.e.

$$\lim_{n \to \infty} \int_{\mathcal{E}} f_n(P)\,G(d\mathcal{E}) = +\infty,$$

and (57) is proved for the case when the integral of $f(P)$ is $(+\infty)$.
Note. A similar theorem holds for a decreasing sequence of
summable functions, except that the integral of the limit function
may be $(-\infty)$ instead of $(+\infty)$. If $f_n(P)$ is a decreasing sequence,
we obtain an increasing sequence on putting $\varphi_n = -f_n$, and the
minus sign can be taken outside the integral.
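
Theorem 2 and the cut-off functions $[\,\cdot\,]_N$ used in its proof can be illustrated on a concrete unbounded function. The sketch below assumes $\mathcal{E} = (0, 1]$ with Lebesgue measure and the hypothetical choice $f(x) = x^{-1/2}$, whose cut-offs $f_n = \min(n, f)$ form a non-decreasing sequence; the closed form $2 - 1/n$ is an elementary computation, checked here against a midpoint sum.

```python
def truncated_integral_exact(n):
    """Integral over (0, 1] of min(n, x**-0.5), computed in closed form."""
    # The cut-off acts on (0, 1/n**2): n * (1/n**2) + (2 - 2/n) = 2 - 1/n.
    return 2.0 - 1.0 / n


def truncated_integral_numeric(n, cells=10000):
    """Midpoint-rule check of the same integral."""
    h = 1.0 / cells
    return h * sum(min(float(n), ((j + 0.5) * h) ** -0.5) for j in range(cells))


levels = (1, 2, 5, 10)
exact = [truncated_integral_exact(n) for n in levels]      # increases towards 2
numeric = [truncated_integral_numeric(n) for n in levels]  # agrees with exact
```

The integrals of the cut-offs increase to 2, the integral of the limit function, as (57) asserts for a non-decreasing sequence.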
The following corollary of the above theorem is important for
what follows.

THEOREM 3. If the functions $u_k(P)$ $(k = 1, 2, 3, \ldots)$ are non-negative
and summable on $\mathcal{E}$, and the series with non-negative terms

$$\sum_{k=1}^{\infty} \int_{\mathcal{E}} u_k(P)\,G(d\mathcal{E}) \tag{63}$$

is convergent, the series

$$\sum_{k=1}^{\infty} u_k(P) \tag{64}$$

is convergent almost everywhere on $\mathcal{E}$, and $u_k(P) \to 0$ almost everywhere
on $\mathcal{E}$.

We consider the non-decreasing sequence of non-negative functions,
summable on $\mathcal{E}$:

$$f_n(P) = \sum_{k=1}^{n} u_k(P)$$

and apply the previous theorem to this sequence. Since series (63)
is convergent, the integrals of the $f_n(P)$ have a finite limit as $n$ increases
indefinitely. Thus the limit function, here expressed by series (64),
is summable on $\mathcal{E}$, and hence has finite values almost everywhere
on $\mathcal{E}$, i.e. series (64) is in fact convergent almost everywhere on $\mathcal{E}$.
But the terms of a convergent series tend to zero on moving away
indefinitely from the initial term, i.e. $u_k(P) \to 0$ almost everywhere
on $\mathcal{E}$, and the theorem is fully proved.

THEOREM 4. If $f_n(P)$ is a sequence of non-negative functions, summable
on $\mathcal{E}$, that tends almost everywhere on $\mathcal{E}$ to a limit function $f(P)$, and
the integrals of the $f_n(P)$ do not exceed some number $A$ for any $n$, i.e.

$$\int_{\mathcal{E}} f_n(P)\,G(d\mathcal{E}) \leq A,$$

then $f(P)$ is summable on $\mathcal{E}$, and we have

$$\int_{\mathcal{E}} f(P)\,G(d\mathcal{E}) \leq A. \tag{65}$$

We have the inequality:

$$\int_{\mathcal{E}} [f_n]_N\,G(d\mathcal{E}) \leq \int_{\mathcal{E}} f_n\,G(d\mathcal{E}) \leq A \tag{66}$$

and, by property 15 of [49], with the number $N$ playing the role of $L$,
we can write

$$\lim_{n \to \infty} \int_{\mathcal{E}} [f_n]_N\,G(d\mathcal{E}) = \int_{\mathcal{E}} [f]_N\,G(d\mathcal{E}).$$

On passing to the limit as $n \to \infty$ in inequality (66), we get

$$\int_{\mathcal{E}} [f]_N\,G(d\mathcal{E}) \leq A,$$

whence it follows that $f(P)$ is summable, and (65) is obtained as
$N \to \infty$.

55. The class $L_2$. We shall consider in the present section a class
of measurable functions. This class plays an important part in
applications of our present theory to various problems of mathematics
and mathematical physics.

DEFINITION. A real function $f(P)$, measurable on a measurable set $\mathcal{E}$
of finite measure, is said to be square summable on $\mathcal{E}$ if its square $f^2(P)$
is summable on $\mathcal{E}$, i.e. if

$$\int_{\mathcal{E}} f^2(P)\,G(d\mathcal{E}) < +\infty.$$

The class of functions square summable on $\mathcal{E}$ is written symbolically
as $L_2^G$. The symbol $L_2$ is used for the case of the Lebesgue integral,
i.e. when $G(\Delta)$ is the area of the interval $\Delta$. We shall in future simply
write $L_2$ instead of $L_2^G$ for the sake of simplicity. But it must be borne
in mind that everything said below holds for any choice of $G(\Delta)$.
We now prove a number of properties of the class $L_2$.

THEOREM 1. If $f(P)$ and $g(P) \in L_2$, $f(P)$ and $f(P)g(P)$ are summable
on $\mathcal{E}$.

These assertions follow at once from the inequalities

$$|f| \leq \tfrac{1}{2}(1 + f^2), \qquad |fg| \leq \tfrac{1}{2}(f^2 + g^2)$$

and properties 1 and 10 of [52]. It must be mentioned that real
functions are understood here and throughout what follows.

THEOREM 2. If $f(P)$ and $g(P) \in L_2$, $cf(P)$ and $f(P) + g(P)$ also
belong to $L_2$.

The statement regarding $cf(P)$ is obvious, whilst it follows for
$f(P) + g(P)$ from the formula

$$(f + g)^2 = f^2 + 2fg + g^2$$

together with Theorem 1 and property 1 of [52].
THEOREM 3. If $f$ and $g \in L_2$, the (Buniakowski-Schwarz) inequality
holds:

$$\left[\int_{\mathcal{E}} fg\,G(d\mathcal{E})\right]^2 \leq \int_{\mathcal{E}} f^2\,G(d\mathcal{E}) \int_{\mathcal{E}} g^2\,G(d\mathcal{E}). \tag{67}$$

The proof is precisely the same as for the Riemann integral. We shall
repeat it. It must be remarked first of all that, if the coefficients
are real and $a > 0$ in the quadratic form $au^2 + 2bu + c$, the identity

$$au^2 + 2bu + c = \frac{1}{a}\left[(au + b)^2 + (ac - b^2)\right]$$

implies that $b^2 \leq ac$ if our quadratic form has non-negative values
for all real $u$ (it suffices to put $u = -b/a$). We assume that $f$ and $g$ are not equivalent to zero,
since otherwise (67) is trivial due to its left-hand side being zero.
We write the obvious equation

$$\int_{\mathcal{E}} (fu + g)^2\,G(d\mathcal{E}) = u^2 \int_{\mathcal{E}} f^2\,G(d\mathcal{E}) + 2u \int_{\mathcal{E}} fg\,G(d\mathcal{E}) + \int_{\mathcal{E}} g^2\,G(d\mathcal{E}),$$

where $u$ is a parameter. The integral on the left has a non-negative
value for any real $u$. Hence the quadratic form on the right also has
this property. We must have $b^2 \leq ac$ for this quadratic form, which
leads to inequality (67). Notice that the coefficient $a$ in the quadratic
form is certainly positive, since $f$ is not equivalent to zero.

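COROLLARY. If $f \in L_2$, obviously $|f| \in L_2$, and by writing $|f|$ as
$|f| = |f| \cdot 1$, we get the inequality:

$$\left|\int_{\mathcal{E}} f\,G(d\mathcal{E})\right| \leq \int_{\mathcal{E}} |f|\,G(d\mathcal{E}) \leq \sqrt{\int_{\mathcal{E}} f^2\,G(d\mathcal{E}) \cdot G(\mathcal{E})}. \tag{68}$$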
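THEOREM 4. If $f$ and $g \in L_2$, we have

$$\sqrt{\int_{\mathcal{E}} (f + g)^2\,G(d\mathcal{E})} \leq \sqrt{\int_{\mathcal{E}} f^2\,G(d\mathcal{E})} + \sqrt{\int_{\mathcal{E}} g^2\,G(d\mathcal{E})}. \tag{69}$$

We obtain from (67):

$$\int_{\mathcal{E}} fg\,G(d\mathcal{E}) \leq \sqrt{\int_{\mathcal{E}} f^2\,G(d\mathcal{E})}\,\sqrt{\int_{\mathcal{E}} g^2\,G(d\mathcal{E})}.$$

We multiply both sides of this inequality by two and add the integral
of $f^2$ and the integral of $g^2$ to both sides of the inequality obtained.
The resulting inequality:

$$\int_{\mathcal{E}} f^2\,G(d\mathcal{E}) + 2\int_{\mathcal{E}} fg\,G(d\mathcal{E}) + \int_{\mathcal{E}} g^2\,G(d\mathcal{E}) \leq \int_{\mathcal{E}} f^2\,G(d\mathcal{E}) + 2\sqrt{\int_{\mathcal{E}} f^2\,G(d\mathcal{E})}\,\sqrt{\int_{\mathcal{E}} g^2\,G(d\mathcal{E})} + \int_{\mathcal{E}} g^2\,G(d\mathcal{E})$$

can be written as

$$\int_{\mathcal{E}} (f + g)^2\,G(d\mathcal{E}) \leq \left[\sqrt{\int_{\mathcal{E}} f^2\,G(d\mathcal{E})} + \sqrt{\int_{\mathcal{E}} g^2\,G(d\mathcal{E})}\right]^2,$$

which leads us directly to (69).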
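
Inequalities (67), (68) and (69) can be checked by direct computation when the integral reduces to a finite sum. The sketch below assumes a hypothetical discrete measure with masses `masses[k]` at four points, so that the integral of a function is the sum of its values weighted by the masses; the particular numbers are arbitrary illustrative data.

```python
import math

# Hypothetical discrete measure: mass G_k sits at the point P_k, so the
# integral of f is the finite sum of f(P_k) * G_k.
masses = [0.5, 1.0, 2.0, 0.25]
f = [1.0, -2.0, 0.5, 4.0]
g = [3.0, 1.0, -1.0, 2.0]


def integral(values):
    """Integral of a function given by its values at the mass points."""
    return sum(v * m for v, m in zip(values, masses))


def product(u, v):
    """Pointwise product of two functions."""
    return [a * b for a, b in zip(u, v)]


int_ff = integral(product(f, f))
int_gg = integral(product(g, g))
int_fg = integral(product(f, g))

# (67): the gap between the two sides must be non-negative.
schwarz_gap = int_ff * int_gg - int_fg ** 2

# (68): |int f| <= int |f| <= sqrt(int f^2 * G(E)).
int_abs_f = integral([abs(v) for v in f])
corollary_gap = math.sqrt(int_ff * sum(masses)) - int_abs_f

# (69): the triangle inequality for the square roots.
f_plus_g = [a + b for a, b in zip(f, g)]
int_sum_sq = integral(product(f_plus_g, f_plus_g))
minkowski_gap = math.sqrt(int_ff) + math.sqrt(int_gg) - math.sqrt(int_sum_sq)
```

With these data the identity $\int (f+g)^2 = \int f^2 + 2\int fg + \int g^2$ used in the proof of (69) reads $18.5 = 9 + 1 + 8.5$.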
Notice further that, if $f(P) \in L_2$, $f(P)$ takes finite values almost
everywhere on $\mathcal{E}$, due to the summability of $f^2(P)$.

56. Convergence in the mean. We now introduce a new type of
convergence in class $L_2$.

DEFINITION. We say that a sequence of functions $f_n(P)$ of $L_2$ is
convergent in the mean to a function $f(P)$ of $L_2$, or simply that it is
convergent in $L_2$ to $f(P)$, if

$$\lim_{n \to \infty} \int_{\mathcal{E}} [f(P) - f_n(P)]^2\,G(d\mathcal{E}) = 0. \tag{70}$$

It must be noticed first of all that, if we replace $f(P)$ by an
equivalent function $g(P)$, the integral in (70) is unchanged, and $g(P)$ is
also a limit in the mean of the $f_n(P)$. In future, we shall regard
equivalent functions of $L_2$ as the same function. We now prove that the
limit is unique, i.e. that the following theorem holds.

THEOREM 5. If a sequence $f_n(P)$ of $L_2$ is convergent in $L_2$ to two functions
$f(P)$ and $g(P)$, these functions are equivalent.

We write the obvious equation

$$f - g = (f - f_n) + (f_n - g)$$

and apply inequality (69) to the right-hand side:

$$\sqrt{\int_{\mathcal{E}} (f - g)^2\,G(d\mathcal{E})} \leq \sqrt{\int_{\mathcal{E}} (f - f_n)^2\,G(d\mathcal{E})} + \sqrt{\int_{\mathcal{E}} (g - f_n)^2\,G(d\mathcal{E})}.$$

As $n \to \infty$, the right-hand side tends to zero, and the left is
independent of $n$, i.e. we have

$$\int_{\mathcal{E}} (f - g)^2\,G(d\mathcal{E}) = 0.$$

It follows from this, by property 8 of [52], that $f - g$ is equivalent
to zero, so that $f$ and $g$ are equivalent. This theorem establishes the
uniqueness of the limit of the $f_n(P)$ in $L_2$, though obviously not
every sequence of functions has a limit in the mean. It should be
noticed that convergence almost everywhere does not imply
convergence in the mean, and convergence in the mean does not imply
convergence almost everywhere. We prove the following theorem in
connection with this.

THEOREM 6. If a sequence $f_n(P)$ of $L_2$ tends to $f(P)$ in the mean on $\mathcal{E}$,
we can extract a subsequence $f_{n_k}(P)$, which is convergent almost everywhere
to $f(P)$ on $\mathcal{E}$.

It follows from (70), by property 10 of [51], that $f_n(P) \to f(P)$
in measure on $\mathcal{E}$, and Theorem 6 is a consequence of Theorem 7
of [44].

COROLLARY. If $f_n(P)$ tend to $f(P)$ in the mean and tend to $\varphi(P)$ almost
everywhere on $\mathcal{E}$, then $\varphi(P)$ and $f(P)$ are equivalent on $\mathcal{E}$. If $f_n(P) \to \varphi(P)$
almost everywhere, then all the more $f_{n_k}(P) \to \varphi(P)$ almost everywhere.
But, as we have seen, $f_{n_k}(P) \to f(P)$ almost everywhere, whence it
follows that $\varphi(P)$ and $f(P)$ are equivalent.
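
The remark that convergence in the mean does not imply convergence almost everywhere, together with Theorem 6, can be illustrated by a standard example that is not taken from the text. The sketch assumes $\mathcal{E} = [0, 1]$ with Lebesgue measure and takes $f_n$ to be the indicator of the $n$-th dyadic interval, the intervals being enumerated level by level; the helper names are hypothetical.

```python
from fractions import Fraction


def typewriter_interval(n):
    """Return (left, right) for the n-th interval, n = 1, 2, 3, ...

    Level k holds 2**k intervals of length 2**-k; n = 2**k + j, 0 <= j < 2**k.
    """
    k = n.bit_length() - 1
    j = n - 2 ** k
    return Fraction(j, 2 ** k), Fraction(j + 1, 2 ** k)


def f(n, x):
    """Indicator function of the n-th interval."""
    left, right = typewriter_interval(n)
    return 1 if left <= x <= right else 0


def mean_square(n):
    """Integral of f_n**2 over [0, 1], i.e. the length of the n-th interval."""
    left, right = typewriter_interval(n)
    return right - left


x = Fraction(1, 3)
# f_n -> 0 in the mean, since the interval lengths tend to zero, yet
# f_n(x) = 1 once on every level, so f_n(x) has no limit.
hits = [n for n in range(1, 64) if f(n, x) == 1]
# The subsequence n_k = 2**k (intervals [0, 2**-k]) tends to 0 at every x > 0.
subsequence_values = [f(2 ** k, x) for k in range(2, 12)]
```

Here the sequence tends to zero in the mean, has no limit at any point, and the subsequence $f_{2^k}$ tends to zero almost everywhere, in agreement with Theorem 6.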

A necessary and sufficient condition can be established for
convergence in the mean, analogous to the Cauchy condition for the
existence of a limit of a numerical sequence [I; 36]. As a preliminary,
we introduce a new definition.

DEFINITION. We say that a sequence $f_n(P)$ of functions of $L_2$ is
mutually convergent in the mean if, given any positive $\varepsilon$, there exists
an $N$ such that

$$\int_{\mathcal{E}} (f_n - f_m)^2\,G(d\mathcal{E}) \leq \varepsilon \quad \text{for } n \text{ and } m > N. \tag{71}$$

THEOREM 7. The necessary condition for a sequence $f_n(P)$ to be
convergent in the mean to some function of $L_2$ is that it be mutually
convergent in the mean.

We are given that the sequence is convergent in the mean to a
function $f(P)$. We write $f_n(P) - f_m(P)$ as

$$f_n - f_m = (f_n - f) + (f - f_m)$$

and apply inequality (69):

$$\sqrt{\int_{\mathcal{E}} (f_n - f_m)^2\,G(d\mathcal{E})} \leq \sqrt{\int_{\mathcal{E}} (f - f_n)^2\,G(d\mathcal{E})} + \sqrt{\int_{\mathcal{E}} (f - f_m)^2\,G(d\mathcal{E})}.$$

Let $\varepsilon$ be a given positive number. In view of the convergence in
the mean to $f(P)$, there exists an $N$ such that, for $n$ and $m > N$,
the integrals under the square root signs on the right-hand side
are $\leq \varepsilon/4$. The right-hand side is then at most $\sqrt{\varepsilon}$, so that our
inequality leads at once to (71), and the theorem
is proved. We now prove the converse.

THEOREM 8. A sufficient condition for a sequence $f_n(P)$ to be
convergent in the mean to some function is that it be mutually convergent
in the mean.

Given that $f_n(P)$ is mutually convergent, we have to show that it
is convergent in the mean to some function. Since it is mutually
convergent, there exists an increasing sequence of subscripts
$n_1 < n_2 < n_3 < \ldots$ such that

$$\int_{\mathcal{E}} [f_{n_{k+1}}(P) - f_{n_k}(P)]^2\,G(d\mathcal{E}) \leq \frac{1}{2^{2k}}.$$

On applying inequality (67) with $f = |f_{n_{k+1}} - f_{n_k}|$ and $g = 1$, we
get

$$\int_{\mathcal{E}} |f_{n_{k+1}}(P) - f_{n_k}(P)|\,G(d\mathcal{E}) \leq \sqrt{\int_{\mathcal{E}} [f_{n_{k+1}}(P) - f_{n_k}(P)]^2\,G(d\mathcal{E})}\,\sqrt{\int_{\mathcal{E}} G(d\mathcal{E})},$$

or, by the previous inequality:

$$\int_{\mathcal{E}} |f_{n_{k+1}}(P) - f_{n_k}(P)|\,G(d\mathcal{E}) \leq \frac{1}{2^k}\sqrt{G(\mathcal{E})}.$$

Hence follows the convergence of the series

$$\sum_{k=1}^{\infty} \int_{\mathcal{E}} |f_{n_{k+1}}(P) - f_{n_k}(P)|\,G(d\mathcal{E}),$$

and, by Theorem 3 of [54], the series

$$\sum_{k=1}^{\infty} |f_{n_{k+1}}(P) - f_{n_k}(P)|$$

is convergent almost everywhere on $\mathcal{E}$. All the more, the series

$$f_{n_1}(P) + \sum_{k=1}^{\infty} [f_{n_{k+1}}(P) - f_{n_k}(P)]$$

is convergent almost everywhere, the sum of the first $p$ terms being
equal to $f_{n_p}(P)$; i.e. the sequence

$$f_{n_1}(P),\ f_{n_2}(P),\ f_{n_3}(P),\ \ldots$$

is convergent almost everywhere on $\mathcal{E}$ to a function $f(P)$ with finite
values. Let us show that $f(P) \in L_2$ and that $f_n(P)$ is convergent in
the mean to $f(P)$. Since the sequence $f_n(P)$ is mutually convergent,
given any positive $\varepsilon$, there exists an $N$ such that

$$\int_{\mathcal{E}} [f_{n_k}(P) - f_n(P)]^2\,G(d\mathcal{E}) \leq \varepsilon \quad \text{for } n_k \text{ and } n > N.$$

We let $n_k$ tend to infinity in this inequality and use Theorem 4 of
[54], which gives us

$$\int_{\mathcal{E}} [f(P) - f_n(P)]^2\,G(d\mathcal{E}) \leq \varepsilon \quad \text{for } n > N. \tag{72}$$

Hence it follows that $f(P) - f_n(P) \in L_2$. But $f_n(P)$ also belongs to
$L_2$. On adding $f(P) - f_n(P)$ and $f_n(P)$, we find, by Theorem 2, that
$f(P) \in L_2$. Inequality (72) shows finally that $f_n(P)$ tends to $f(P)$ in
the mean. The last two theorems lead to the following: the necessary
and sufficient condition for the sequence $f_n(P)$ to be mutually
convergent in the mean is that the sequence be convergent in the mean
to some function.

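THEOREM 9. If $f_n(P)$ and $g_n(P) \in L_2$ and $f_n(P) \to f(P)$, $g_n(P) \to g(P)$
in the mean, then

$$\int_{\mathcal{E}} f_n(P)\,g_n(P)\,G(d\mathcal{E}) \to \int_{\mathcal{E}} f(P)\,g(P)\,G(d\mathcal{E}).$$

On using the notation for any two functions $\varphi(P)$ and $\psi(P)$ of $L_2$
[IV; 35]:

$$(\varphi, \psi) = \int_{\mathcal{E}} \varphi(P)\,\psi(P)\,G(d\mathcal{E}),$$

we can write the Buniakowski inequality as

$$(\varphi, \psi)^2 \leq (\varphi, \varphi)\,(\psi, \psi).$$

We now put

$$f_n(P) - f(P) = \varphi_n(P), \qquad g_n(P) - g(P) = \psi_n(P).$$

By hypothesis, $(\varphi_n, \varphi_n)$ and $(\psi_n, \psi_n) \to 0$. We form the difference

$$(f, g) - (f_n, g_n) = (f, g) - (f + \varphi_n,\ g + \psi_n) = -(f, \psi_n) - (\varphi_n, g) - (\varphi_n, \psi_n),$$

whence

$$|(f, g) - (f_n, g_n)| \leq |(f, \psi_n)| + |(\varphi_n, g)| + |(\varphi_n, \psi_n)| \leq \sqrt{(f, f)}\sqrt{(\psi_n, \psi_n)} + \sqrt{(\varphi_n, \varphi_n)}\sqrt{(g, g)} + \sqrt{(\varphi_n, \varphi_n)}\sqrt{(\psi_n, \psi_n)}.$$

The right-hand side tends to zero as $n \to \infty$, whence
$|(f, g) - (f_n, g_n)| \to 0$, i.e. $(f_n, g_n) \to (f, g)$, which is what we set out to prove.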


57. Hilbert function space. Like the family $C$ of [14], the family
of functions of $L_2$ forms a function space. An element of this space
is a real function, square summable on $\mathcal{E}$. Equivalent functions are
identified here, i.e. they correspond to the same element of $L_2$.

Addition of elements and multiplication by a real number can be
defined, the operations being subject to the ordinary laws of algebra.
The norm of an element $f(P)$ (the length of a vector) is defined as the
non-negative number given by

$$\|f(P)\| = \sqrt{\int_{\mathcal{E}} f^2(P)\,G(d\mathcal{E})}. \tag{73}$$

We say that a sequence of elements $f_n(P)$ of $L_2$ is convergent to
an element $f(P)$ of $L_2$ if $\|f(P) - f_n(P)\| \to 0$ as $n \to \infty$. By (73),
this convergence in norm is equivalent to convergence in the mean.

The scalar product of two elements $f(P)$ and $g(P)$ can also be defined.
It is given by

$$(f, g) = \int_{\mathcal{E}} fg\,G(d\mathcal{E}), \tag{74}$$

and we obviously have

$$\|f\| = \sqrt{(f, f)}. \tag{75}$$

The distance between two elements $f$ and $g$ is given by

$$\rho(f, g) = \|f - g\| = \sqrt{\int_{\mathcal{E}} (f - g)^2\,G(d\mathcal{E})} = \sqrt{(f - g,\ f - g)}. \tag{76}$$

Given three elements $f$, $g$ and $h$, we can write $f - h = (f - g) + (g - h)$
and apply (69). We thus obtain, using definition (74), the so-called
triangle rule:

$$\rho(f, h) \leq \rho(f, g) + \rho(g, h). \tag{77}$$

The zero, or zero element, of the space is defined as the function
identically equal to zero on $\mathcal{E}$, or, what amounts to the same thing,
the function equivalent to zero. The norm of the zero element is
zero, whilst the norm of any other element is positive, by property 8
of [51]. The distance $\rho(f, g) \geq 0$, and the equals sign only holds
when the elements are the same, i.e. $f$ and $g$ are equivalent functions.
The distance and scalar product are symmetrical, i.e. $\rho(g, f) = \rho(f, g)$
and $(g, f) = (f, g)$. The necessary and sufficient condition for a
sequence of elements to have a limit in our functional space is that
it be mutually convergent, i.e. given any $\varepsilon$, there exists an $N$ such
that $\|f_m - f_n\| \leq \varepsilon$ for $n$ and $m > N$. This last property is usually
described by saying that the space $L_2$ is complete.

Inequality (77) obviously holds for any finite number of terms:

$$\rho(f_1, f_m) \leq \rho(f_1, f_2) + \rho(f_2, f_3) + \ldots + \rho(f_{m-1}, f_m)$$

or

$$\|f_1 - f_m\| \leq \|f_1 - f_2\| + \|f_2 - f_3\| + \ldots + \|f_{m-1} - f_m\|. \tag{77$_1$}$$


An essentially similar space $L_2$ can be formed for the complex
functions (54). The function $f(P) = f_1(P) + i f_2(P)$ is said to belong
to $L_2$ if $f_1(P)$ and $f_2(P)$ belong to $L_2$. The square of the modulus $|f(P)|^2$
is a summable function. Theorems 1 and 2 are retained. Inequalities
(67) and (69) are rewritten as

$$\left|\int_{\mathcal{E}} f\bar{g}\,G(d\mathcal{E})\right|^2 \leq \int_{\mathcal{E}} |f|^2\,G(d\mathcal{E}) \int_{\mathcal{E}} |g|^2\,G(d\mathcal{E}),$$

$$\sqrt{\int_{\mathcal{E}} |f + g|^2\,G(d\mathcal{E})} \leq \sqrt{\int_{\mathcal{E}} |f|^2\,G(d\mathcal{E})} + \sqrt{\int_{\mathcal{E}} |g|^2\,G(d\mathcal{E})}. \tag{78}$$

In the definitions of convergence in the mean and mutual
convergence $(f - f_n)^2$ and $(f_n - f_m)^2$ have to be replaced by $|f - f_n|^2$
and $|f_n - f_m|^2$. Theorems 7 and 8 are retained. Multiplication by a
complex as well as a real number is permissible when forming the
functional space. The norm of an element is given by

$$\|f\| = \sqrt{\int_{\mathcal{E}} |f|^2\,G(d\mathcal{E})}, \tag{79}$$

and the scalar product by

$$(f, g) = \int_{\mathcal{E}} f\bar{g}\,G(d\mathcal{E}), \tag{80}$$

where $\bar{a}$ denotes as usual the complex conjugate of $a$. Formula (77)
holds as before. The distance between two elements is given by (76)
with $(f - g)^2$ replaced by $|f - g|^2$, and it has the same properties
as in a real space. We have for the scalar product: $(g, f) = \overline{(f, g)}$.
Everything said about the complex function space follows at once
from the fact that functions $f_1(P)$ and $f_2(P)$ belong to the real space $L_2$.
The functional space $L_2$ is often called a Hilbert functional space.
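
The definitions (76), (79) and (80) can be tried out on a small case. The sketch below assumes, purely for illustration, that $G$ consists of unit masses at three points, so that every integral is a sum over the points and a function is the list of its three complex values; the function names are hypothetical.

```python
import math

# Assumption: unit masses at m = 3 points, so integrals are plain sums.
f = [1 + 2j, -1j, 3 + 0j]
g = [2 - 1j, 1 + 1j, -2j]
h = [0j, 1 + 0j, 1 - 1j]


def scalar(u, v):
    """Scalar product (80): sum of u_k times the conjugate of v_k."""
    return sum(a * b.conjugate() for a, b in zip(u, v))


def norm(u):
    """Norm (79): the square root of (u, u), which is real and non-negative."""
    return math.sqrt(scalar(u, u).real)


def distance(u, v):
    """Distance (76) with |u - v|**2 in place of (u - v)**2."""
    return norm([a - b for a, b in zip(u, v)])


fg = scalar(f, g)   # -1 + 10j for these data
gf = scalar(g, f)   # the complex conjugate of (f, g)
```

For these values $(f, g) = -1 + 10i$ and $(g, f) = -1 - 10i$, and the triangle rule (77) can be checked with `distance`.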

A particular case must be mentioned. Suppose that the function
$G(\mathcal{E})$ corresponds to concentrated masses located at the points
$P_1, P_2, \ldots, P_m$ and equal to unity. In this case the Lebesgue-Stieltjes
integral of any function $f(P)$ taking finite values at the above
points over any set $\mathcal{E}$ containing the points degenerates to a finite
sum:

$$\int_{\mathcal{E}} f(P)\,G(d\mathcal{E}) = \sum_{k=1}^{m} f(P_k).$$

If we regard the values of any $f(P)$ at the $P_k$ $(k = 1, \ldots, m)$ as
the components of an $m$-dimensional complex vector, we obtain an
$m$-dimensional space $R_m$, the theory of which was discussed in volume
III [III$_1$; 25]. The definitions just given of addition, multiplication by a
number, norm, scalar product etc. are the same as those discussed
earlier.

58. Orthogonal systems of functions. The theory of orthogonal
systems of functions is directly connected with the functional space
$L_2$. This theory has already been described [IV; 38, 80]. We shall
supplement the previous treatment by bringing in Lebesgue-Stieltjes
or Lebesgue integrals. We shall start by discussing real
functions.

DEFINITION. We say that the functions

$$\varphi_1(P),\ \varphi_2(P),\ \ldots, \tag{81}$$

given on a measurable set $\mathcal{E}$ of finite measure and belonging to $L_2$, form
an orthogonal and normed [or orthonormal] system if we have

$$\int_{\mathcal{E}} \varphi_k(P)\,\varphi_l(P)\,G(d\mathcal{E}) = \begin{cases} 0 & \text{for } k \neq l, \\ 1 & \text{for } k = l. \end{cases} \tag{82}$$

Given any function $f(P)$ of $L_2$, we can form its Fourier coefficients
with respect to system (81):

$$a_k = \int_{\mathcal{E}} f(P)\,\varphi_k(P)\,G(d\mathcal{E}), \tag{83}$$

and its Fourier series:

$$\sum_{k=1}^{\infty} a_k \varphi_k(P). \tag{84}$$

We cannot say anything about the convergence of this series, but
we can form the segment of the series:

$$S_n(f) = \sum_{k=1}^{n} a_k \varphi_k(P). \tag{85}$$

The expression

$$\int_{\mathcal{E}} \left[f(P) - \sum_{k=1}^{n} b_k \varphi_k(P)\right]^2 G(d\mathcal{E}) \tag{86}$$

has a least value if the coefficients $b_k$ are taken equal to the Fourier
coefficients $a_k$. In this case we get the following simple formula for
expression (86):

$$\int_{\mathcal{E}} [f(P) - S_n(f)]^2\,G(d\mathcal{E}) = \int_{\mathcal{E}} f^2(P)\,G(d\mathcal{E}) - \sum_{k=1}^{n} a_k^2, \tag{87}$$

from which Bessel's inequality follows:

$$\sum_{k=1}^{\infty} a_k^2 \leq \int_{\mathcal{E}} f^2(P)\,G(d\mathcal{E}) \tag{88}$$

and the convergence of the series on the left-hand side of this inequality.
If the = sign holds in (88), the resulting formula:

$$\int_{\mathcal{E}} f^2(P)\,G(d\mathcal{E}) = \sum_{k=1}^{\infty} a_k^2 \tag{89}$$

is called the closure equation. By (87), the closure equation is
equivalent to the fact that the segments of the Fourier series $S_n(f)$ tend
in the mean to $f(P)$. We now prove the following fundamental theorem.

THEOREM 1 (Riesz-Fischer). If $c_k$ is any given sequence of real
numbers, the squares of which form a convergent series:

$$\sum_{n=1}^{\infty} c_n^2 < +\infty, \tag{90}$$

there exists a unique function of $L_2$ for which the $c_k$ are the Fourier
coefficients with respect to system (81) and for which the closure equation
(89) holds.
We form the functions

$$S_n(P) = \sum_{k=1}^{n} c_k \varphi_k(P). \tag{91}$$

Since system (81) is orthonormal, we have

$$\int_{\mathcal{E}} [S_q(P) - S_p(P)]^2\,G(d\mathcal{E}) = c_{p+1}^2 + c_{p+2}^2 + \ldots + c_q^2 \qquad (q > p)$$

and the convergence of series (90) implies that the right-hand side
of the last formula tends to zero on indefinite increase of $p$, i.e. the
sequence of functions (91) of $L_2$ is mutually convergent. Thus a
function $f(P)$ of $L_2$ exists, to which $S_n(P)$ is convergent in the mean:

$$\lim_{n \to \infty} \int_{\mathcal{E}} [f(P) - S_n(P)]^2\,G(d\mathcal{E}) = 0. \tag{92}$$

Let us show that the $c_k$ are the Fourier coefficients $a_k$ of this function.
On taking (83) into account, as also that system (81) is orthonormal,
we can write

$$\int_{\mathcal{E}} [f(P) - S_n(P)]^2\,G(d\mathcal{E}) = \left[\int_{\mathcal{E}} f^2(P)\,G(d\mathcal{E}) - \sum_{k=1}^{n} a_k^2\right] + \sum_{k=1}^{n} (c_k - a_k)^2. \tag{93}$$

The difference in square brackets on the right-hand side is
non-negative by Bessel's inequality. The remaining terms on the
right-hand side are also non-negative. As $n \to \infty$, the left-hand side tends
to zero, so that the same can be said of the right-hand side. It follows
from this that each of the non-negative terms $(c_k - a_k)^2$ is equal to
zero, i.e. $c_k = a_k$, which is what we set out to prove. Thus functions
(91) are segments of the Fourier series of $f(P)$ and it follows from
(92) that the closure equation holds for $f(P)$. It remains to show that
the $f(P)$ with the properties mentioned is unique. If a function $g(P)$
exists with these properties, (91) are segments of the Fourier series
both for $f(P)$ and for $g(P)$, i.e. the sequence $S_n(P)$ tends in the mean
both to $f(P)$ and to $g(P)$. Since the limit in $L_2$ is unique, this means
that $f(P)$ and $g(P)$ are equivalent, i.e. they represent the same element
of $L_2$, and the theorem is fully proved. We now define a closed system.

DEFINITION. An orthonormal system (81) is said to be closed if the
closure equation (89) holds for any function $f(P)$ of $L_2$.

We have not assumed in the proof of Theorem 1 that system (81)
is closed. If this is the case, it is not necessary to stipulate that the
closure equation holds for the function, since it holds for any function
of $L_2$ by the definition of a closed system. Thus Theorem 1 is stated
as follows for a closed system.

THEOREM 1. If system (81) is closed and $c_k$ is any given sequence
of real numbers, for which series (90) is convergent, there exists a unique
function of $L_2$ for which the $c_k$ are the Fourier coefficients.
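
The key step in the proof of the Riesz-Fischer theorem above is that the mean square distance between two segments $S_p$ and $S_q$ equals the corresponding section of the series (90). The sketch below checks this numerically under assumptions that are not part of the text: $\mathcal{E} = [0, 1]$ with Lebesgue measure, the orthonormal system $\sqrt{2}\sin k\pi x$, the coefficients $c_k = 1/k$, and a midpoint sum in place of the integral.

```python
import math

CELLS = 2000  # number of cells of the midpoint rule on [0, 1]


def phi(k, x):
    """Assumed orthonormal system on [0, 1]: sqrt(2) * sin(k * pi * x)."""
    return math.sqrt(2.0) * math.sin(k * math.pi * x)


def partial_sum(n, x):
    """S_n(x) from (91) with the square-summable coefficients c_k = 1/k."""
    return sum(phi(k, x) / k for k in range(1, n + 1))


def mean_square_gap(p, q):
    """Midpoint approximation of the integral of (S_q - S_p)**2 over [0, 1]."""
    h = 1.0 / CELLS
    total = 0.0
    for j in range(CELLS):
        x = (j + 0.5) * h
        total += (partial_sum(q, x) - partial_sum(p, x)) ** 2
    return h * total


def coefficient_tail(p, q):
    """c_{p+1}**2 + ... + c_q**2 for c_k = 1/k."""
    return sum(1.0 / k ** 2 for k in range(p + 1, q + 1))


gap_5_20 = mean_square_gap(5, 20)
gap_20_40 = mean_square_gap(20, 40)
```

The two quantities agree to rounding error, and the gap shrinks as $p$ grows, which is the mutual convergence of the segments (91).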

In addition to a closed system, we can define a complete system.

DEFINITION. System (81) is said to be complete, if there exists no
function in $L_2$, not zero (i.e. not equivalent to zero) and orthogonal to
all the $\varphi_k(P)$.

We now show that the concepts of closure and completeness are
equivalent.

THEOREM 2. The necessary and sufficient condition for system (81)
to be complete is that it be closed.

We use reductio ad absurdum for the necessity. Let system (81)
be complete and non-closed, i.e. there exists a function $h(P)$ of $L_2$
with Fourier coefficients $a_k$ such that

$$\int_{\mathcal{E}} h^2(P)\,G(d\mathcal{E}) > \sum_{k=1}^{\infty} a_k^2. \tag{94}$$

On the other hand, by Theorem 1, an $f(P)$ of $L_2$ exists, with the
same Fourier coefficients $a_k$, for which the closure equation (89) holds.
On comparing this formula with (94), we get

$$\int_{\mathcal{E}} h^2(P)\,G(d\mathcal{E}) > \int_{\mathcal{E}} f^2(P)\,G(d\mathcal{E}). \tag{95}$$

But $f(P) - h(P)$ has Fourier coefficients which are all zero, i.e.
it is orthogonal to all the $\varphi_k(P)$; since the system of these is complete,
$f(P) - h(P)$ is equivalent to zero, which contradicts (95), i.e. the
necessity is proved. Let us prove the sufficiency. Given that the
system is closed, we have to show that it is complete, i.e. that if all
the Fourier coefficients of a function $f(P)$ are zero, $f(P)$ is equivalent
to zero. Since the system is closed, we can write (89) for $f(P)$, which
gives us, since all the Fourier coefficients of $f(P)$ are zero:

$$\int_{\mathcal{E}} f^2(P)\,G(d\mathcal{E}) = 0,$$

whence it follows, by property 8 of [51], that $f(P)$ is equivalent to
zero. Notice that the orthogonalization process that we described in
[IV; 38] can be applied for any system of functions $\psi_k(P)$ of $L_2$.
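
The orthogonalization process, formula (87) and the equivalence of closure and completeness can all be seen in a small finite case. The sketch below assumes unit masses at three points, so that a function is the list of its three values and integrals are sums; `orthonormalize` is a plain Gram-Schmidt routine written for this illustration, not a transcription of [IV; 38].

```python
import math


def scalar(u, v):
    """Scalar product for unit masses at three points."""
    return sum(a * b for a, b in zip(u, v))


def orthonormalize(vectors):
    """Gram-Schmidt: turn linearly independent vectors into an orthonormal system."""
    system = []
    for v in vectors:
        w = list(v)
        for phi in system:
            coeff = scalar(v, phi)
            w = [wi - coeff * p for wi, p in zip(w, phi)]
        length = math.sqrt(scalar(w, w))
        system.append([wi / length for wi in w])
    return system


def bessel_defect(f, system):
    """Integral of f**2 minus the sum of squared Fourier coefficients, as in (87)."""
    return scalar(f, f) - sum(scalar(f, phi) ** 2 for phi in system)


f = [1.0, 2.0, 3.0]
two = orthonormalize([[1.0, 1.0, 1.0], [1.0, -1.0, 0.0]])
three = orthonormalize([[1.0, 1.0, 1.0], [1.0, -1.0, 0.0], [0.0, 0.0, 1.0]])

defect_two = bessel_defect(f, two)      # 14 - 12 - 0.5 = 1.5: strict Bessel inequality
defect_three = bessel_defect(f, three)  # 0 up to rounding: closure equation (89)
```

With two functions the system is neither closed nor complete: the third function produced by `orthonormalize` is non-zero and orthogonal to both. With all three the closure equation holds.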

Everything said above can be extended at once to the case of
complex functions of $L_2$. The fact that system (81) is orthonormal
is now expressed by the equations

$$\int_{\mathcal{E}} \varphi_k(P)\,\overline{\varphi_l(P)}\,G(d\mathcal{E}) = \begin{cases} 0 & \text{for } k \neq l, \\ 1 & \text{for } k = l, \end{cases} \tag{96}$$

whilst the Fourier coefficients are defined by

$$a_k = \int_{\mathcal{E}} f(P)\,\overline{\varphi_k(P)}\,G(d\mathcal{E}). \tag{97}$$

In further formulae, we always have to write the square of the
modulus instead of the square of a function or number. For instance,
the closure equation takes the form

$$\int_{\mathcal{E}} |f(P)|^2\,G(d\mathcal{E}) = \sum_{k=1}^{\infty} |a_k|^2. \tag{98}$$

The above theorems are retained, except that we have to consider
the series of $|c_k|^2$ instead of series (90).

We must also deduce the so-called generalized closure equation.
Let $a_n$ and $b_n$ be the Fourier coefficients of $f(P)$ and $g(P)$, and let
system (81) be closed. The function $f(P) + g(P)$ has Fourier coefficients
$a_n + b_n$, whilst $f(P) + ig(P)$ has coefficients $a_n + ib_n$. The closure
equations for these are

$$\int_{\mathcal{E}} |f + g|^2\,G(d\mathcal{E}) = \sum_{n=1}^{\infty} |a_n + b_n|^2,$$

$$\int_{\mathcal{E}} |f + ig|^2\,G(d\mathcal{E}) = \sum_{n=1}^{\infty} |a_n + ib_n|^2,$$

or

$$\int_{\mathcal{E}} \left[|f|^2 + |g|^2 + (f\bar{g} + \bar{f}g)\right] G(d\mathcal{E}) = \sum_{n=1}^{\infty} \left[|a_n|^2 + |b_n|^2 + (a_n\bar{b}_n + \bar{a}_n b_n)\right],$$

$$\int_{\mathcal{E}} \left[|f|^2 + |g|^2 + i(\bar{f}g - f\bar{g})\right] G(d\mathcal{E}) = \sum_{n=1}^{\infty} \left[|a_n|^2 + |b_n|^2 + i(\bar{a}_n b_n - a_n\bar{b}_n)\right].$$

On taking into account the closure equations for $f$ and $g$, multiplying
the second equation through by $i$ and adding to the first, we get
the generalized closure equation

$$\int_{\mathcal{E}} f\bar{g}\,G(d\mathcal{E}) = \sum_{n=1}^{\infty} a_n\bar{b}_n. \tag{99}$$

In the case of real functions the generalized equation becomes

$$\int_{\mathcal{E}} fg\,G(d\mathcal{E}) = \sum_{n=1}^{\infty} a_n b_n. \tag{100}$$

An immediate consequence of the generalized closure equation is
that the Fourier series of any function $f(P)$ of $L_2$ can be integrated
term by term over the set $\mathcal{E}$ or any measurable part of it $\mathcal{E}'$ [II; 156];
in other words, if $a_k$ $(k = 1, 2, \ldots)$ are the Fourier coefficients of
$f(P)$, then

$$\int_{\mathcal{E}'} f(P)\,G(d\mathcal{E}) = \sum_{k=1}^{\infty} a_k \int_{\mathcal{E}'} \varphi_k(P)\,G(d\mathcal{E}). \tag{101}$$
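
The generalized closure equation (99) can be verified directly in a finite complex case. The sketch below assumes unit masses at four points and, as an illustrative closed orthonormal system, the discrete exponentials $\varphi_k(P_s) = e^{2\pi i s k/4}/2$; neither choice comes from the text.

```python
import cmath
import math

M = 4  # assumption: unit masses at the points P_0, ..., P_3


def phi(k):
    """Assumed orthonormal system: phi_k(P_s) = exp(2*pi*i*s*k/M) / sqrt(M)."""
    return [cmath.exp(2j * cmath.pi * s * k / M) / math.sqrt(M) for s in range(M)]


def scalar(u, v):
    """Scalar product (80) for unit masses."""
    return sum(a * b.conjugate() for a, b in zip(u, v))


f = [1 + 1j, 2 + 0j, -1j, 0.5 + 0j]
g = [3 + 0j, -1 + 2j, 1 + 0j, 1j]

a = [scalar(f, phi(k)) for k in range(M)]  # Fourier coefficients (97) of f
b = [scalar(g, phi(k)) for k in range(M)]  # Fourier coefficients (97) of g

left = scalar(f, g)                                        # integral of f * conj(g)
right = sum(ak * bk.conjugate() for ak, bk in zip(a, b))   # sum of a_n * conj(b_n)
```

The two sides of (99) agree to rounding error; putting $g = f$ gives the closure equation (98).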

Let us indicate a further property of space $L_2$, which implies the
existence in $L_2$ of a closed orthonormal system. This property is
usually known as separability and consists in the following: there
exists a denumerable set of elements $\psi_k(P)$ $(k = 1, 2, \ldots)$ of $L_2$,
dense in $L_2$, i.e. such that, given any $f(P)$ of $L_2$ and any positive $\varepsilon$,
there exists an element $\psi_n(P)$ of this denumerable set such that
$\|f(P) - \psi_n(P)\| \leq \varepsilon$. We shall prove the separability of $L_2$ in a
later section. We now show that separability implies the existence of
a closed orthonormal system. On applying the orthogonalization
process [IV; 38] to $\psi_k(P)$, we obtain some orthonormal system
$\varphi_k(P)$ $(k = 1, 2, \ldots)$. Let us show that it is closed. By what has
been said, given any $f(P)$ of $L_2$ and any positive $\varepsilon$, there exists a
$\psi_m(P)$ such that $\|f - \psi_m\| \leq \varepsilon$. But, by virtue of the
orthogonalization process, $\psi_m(P)$ is a finite linear combination of the $\varphi_k(P)$, i.e.
$\psi_m(P) = c_1\varphi_1(P) + c_2\varphi_2(P) + \ldots + c_l\varphi_l(P)$, and thus

$$\|f - \psi_m\|^2 = \int_{\mathcal{E}} \left[f(P) - \sum_{k=1}^{l} c_k \varphi_k(P)\right]^2 G(d\mathcal{E}) \leq \varepsilon^2.$$

If we replace the $c_k$ by the Fourier coefficients of $f(P)$ with respect
to system $\varphi_k(P)$, the inequality holds all the more [II, 148]:

$$\int_{\mathcal{E}} [f(P) - S_l(f)]^2\,G(d\mathcal{E}) \leq \varepsilon^2,$$

where $S_l(f)$ is the segment of the Fourier series of $f(P)$. By (87), the
left-hand side does not increase when $l$ increases. Since $\varepsilon$ is
arbitrary, this inequality implies that the system $\varphi_k(P)$ is closed.

A further remark: when $G(\mathcal{E})$ corresponds to concentrated masses
at the points $P_1, P_2, \ldots, P_m$, a closed system contains only $m$
elements, and this case is of no interest. As already mentioned, it reduces
to a finite-dimensional space $R_m$.


59. The space $l_2$. The space $l_2$ of infinite sequences is closely
connected with the space $L_2$; here, we consider the complex case straight
away. An element of $l_2$ is an infinite sequence of complex numbers
$x\,(x_1, x_2, x_3, \ldots)$ such that the series of $|x_n|^2$ is convergent. The
definitions of multiplication of an element by a complex number and
addition of elements are obvious. By definition, an element $cx$ has
coordinates $(cx_1, cx_2, \ldots)$ and the sum of elements $x$ and $y$ with
coordinates $x_n$ and $y_n$ has coordinates $x_n + y_n$; the convergence of the
series of $|x_n + y_n|^2$ is an immediate consequence of the convergence
of the series of $|x_n|^2$ and $|y_n|^2$, in view of the obvious inequality
$|x_n + y_n|^2 \leq 2(|x_n|^2 + |y_n|^2)$. The norm of an element $x$ is given by

$$\|x\| = \sqrt{\sum_{n=1}^{\infty} |x_n|^2}, \tag{102}$$

and the scalar product of elements $x$ and $y$ by

$$(x, y) = \sum_{n=1}^{\infty} x_n \bar{y}_n, \tag{103}$$

the absolute convergence of the series on the right being an immediate
consequence of $|x_n \bar{y}_n| \leq \tfrac{1}{2}(|x_n|^2 + |y_n|^2)$.



We have

$$\|x\|^2 = (x, x). \tag{104}$$

The distance between elements $x$ and $y$ is given by

$$\rho(x, y) = \sqrt{(x - y,\ x - y)} = \|x - y\| = \sqrt{\sum_{n=1}^{\infty} |x_n - y_n|^2}. \tag{105}$$

The following inequalities are precise analogues of (67) and (69):

$$\left|\sum_{n=1}^{\infty} x_n \bar{y}_n\right|^2 \leq \sum_{n=1}^{\infty} |x_n|^2 \cdot \sum_{n=1}^{\infty} |y_n|^2, \tag{106}$$

$$\sqrt{\sum_{n=1}^{\infty} |x_n + y_n|^2} \leq \sqrt{\sum_{n=1}^{\infty} |x_n|^2} + \sqrt{\sum_{n=1}^{\infty} |y_n|^2}. \tag{107}$$

They are proved in the same way as (67) and (69). Notice that
(106) can be written as

$$|(x, y)|^2 \leq \|x\|^2\,\|y\|^2. \tag{108}$$

In view of (107), the triangle rule holds for distances. The zero
element is the element all the coordinates of which are zero. We say
that a sequence of elements $x^{(n)}$ is convergent to an element $x$ if
$\|x - x^{(n)}\| \to 0$. Let $x_k^{(n)}$ be the coordinates of $x^{(n)}$ and $x_k$ the
coordinates of $x$. The convergence of $x^{(n)}$ to $x$ is equivalent to

$$\|x - x^{(n)}\|^2 = \sum_{k=1}^{\infty} |x_k - x_k^{(n)}|^2 \to 0 \quad \text{as } n \to \infty. \tag{109}$$
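
Condition (109) and inequality (106) can be tried on explicit sequences. The sketch below uses the hypothetical elements $x_k = 2^{-k}$ and $y_k = 3^{-k}$ of $l_2$, takes $x^{(n)}$ to be $x$ with all coordinates after the $n$-th replaced by zero, and sums only the first `TERMS` coordinates exactly with rational arithmetic, so every quantity is a finite section of the corresponding series.

```python
from fractions import Fraction

TERMS = 60  # how many coordinates of each sequence are actually summed


def x(k):
    return Fraction(1, 2 ** k)   # x_k = 2**-k, k = 1, 2, ...


def y(k):
    return Fraction(1, 3 ** k)   # y_k = 3**-k


def truncation_error_sq(n):
    """Section of the sum in (109) when x^(n) keeps the first n coordinates of x."""
    return sum(x(k) ** 2 for k in range(n + 1, TERMS + 1))


norm_x_sq = sum(x(k) ** 2 for k in range(1, TERMS + 1))
norm_y_sq = sum(y(k) ** 2 for k in range(1, TERMS + 1))
scalar_xy = sum(x(k) * y(k) for k in range(1, TERMS + 1))

errors = [truncation_error_sq(n) for n in (5, 10, 20)]
```

The full tail of the series for $\|x - x^{(n)}\|^2$ is $4^{-n}/3$, so the truncations $x^{(n)}$ converge to $x$ in $l_2$; the finite sections also satisfy (106).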


To discover the connection between spaces $L_2$ and $l_2$, we take
any closed orthonormal system (81). For every function $f(P)$
of $L_2$ there will be a corresponding sequence of complex numbers
$a_k$, its Fourier coefficients, the series of $|a_k|^2$ being convergent.
Conversely, for every sequence of complex numbers with a convergent
series of squared moduli there is a corresponding
definite function of $L_2$, by virtue of Theorem 1. Thus, by
taking a closed system (81), we establish a one-to-one correspondence
between the elements of $L_2$ and $l_2$. Each element of $L_2$ corresponds
to one definite element of $l_2$, and vice versa. Since the Fourier
coefficients of a finite linear combination of functions $\sum_{k=1}^{m} c_k f_k(P)$ are equal
to the corresponding linear combination of Fourier coefficients of the
component functions $f_k(P)$, we can say that our one-to-one
correspondence is distributive, i.e. if elements $f_k(P)$ $(k = 1, 2, \ldots, m)$ of $L_2$
correspond to elements $x^{(k)}$ of $l_2$, the element $\sum_{k=1}^{m} c_k f_k(P)$ corresponds
to the element $\sum_{k=1}^{m} c_k x^{(k)}$. By the generalized closure equation (99),
the scalar products of corresponding elements in $L_2$ and $l_2$ are the
same. By the closure equation (98), the norms of elements are also
the same. Spaces $L_2$ and $l_2$ are different realizations of the same
abstract space. We shall later investigate the properties of this
abstract space and operators in it by describing the space with the
aid of a system of axioms. The concept of mutual convergence in
space $l_2$ must also be mentioned. We say that a sequence of elements
$x^{(n)}$ is mutually convergent in $l_2$ if, given any positive $\varepsilon$, there exists
an $N$ such that $\|x^{(n)} - x^{(m)}\| \leq \varepsilon$ for $m$ and $n > N$. On taking
into account the correspondence between spaces $L_2$ and $l_2$, and Theorems
7 and 8 of [56], we can say that the necessary and sufficient condition
for a sequence $x^{(n)}$ to have a limit in $l_2$ is that it be mutually convergent.
The limit is unique.

We take the set $K$ of elements of $l_2$ having only a finite number
of non-zero coordinates, all these coordinates being rational complex
numbers, i.e. numbers of the form $a + bi$, where $a$ and $b$ are rational
real numbers. Since the set of rational numbers is denumerable, our
set $K$ is denumerable. Let us show that it is dense in $l_2$. Let
$x\,(x_1, x_2, \ldots)$ be an element of $l_2$ and $z\,(c_1, c_2, \ldots, c_n, 0, 0, \ldots)$ be an
element of the set $K$. We have

$$\|x - z\|^2 = \sum_{k=1}^{n} |x_k - c_k|^2 + \sum_{k=n+1}^{\infty} |x_k|^2. \tag{110}$$

Let $\varepsilon$ be a given positive number. Since the series of $|x_k|^2$ is
convergent, we can fix an $n = n_0$ such that

$$\sum_{k=n_0+1}^{\infty} |x_k|^2 < \frac{\varepsilon^2}{2}.$$

In the finite sum

$$\sum_{k=1}^{n_0} |x_k - c_k|^2$$

we can choose the rational numbers $c_k$ so close to $x_k$ that this sum
is also $< \varepsilon^2/2$. We now have, by (110), $\|x - z\|^2 < \varepsilon^2$, i.e.
$\|x - z\| < \varepsilon$. This shows that the denumerable set $K$ of elements of $l_2$
is dense in $l_2$. The space $l_2$ is therefore separable. The elements of
this space

$$e_1\,(1, 0, 0, \ldots); \quad e_2\,(0, 1, 0, 0, \ldots); \quad e_3\,(0, 0, 1, 0, \ldots); \quad \ldots$$

form a closed orthonormal system.
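
The two-step estimate based on (110) can be tried out numerically. The
following is only a sketch under stated assumptions: the sample element
$x_k = 1/(k\sqrt{2})$, the function names and the real (rather than complex)
rational coordinates are all chosen here for illustration and do not come
from the text. The tail is bounded by $\sum_{k>n} 1/(2k^2) < 1/(2n)$, and the
head by rounding each coordinate to a rational with bounded denominator.

```python
from fractions import Fraction
import math

def x_coord(k):
    # Hypothetical sample element of l_2: x_k = 1 / (k * sqrt(2)), k = 1, 2, ...
    return 1.0 / (k * math.sqrt(2.0))

def approximate_in_K(eps):
    """Return (z, bound): z is a finite list of rational coordinates and
    bound is an upper estimate of ||x - z||^2, split as in (110)."""
    # Tail: sum_{k>n} x_k^2 = (1/2) sum_{k>n} 1/k^2 < 1/(2n) < eps^2/2 for n > 1/eps^2.
    n0 = math.ceil(1.0 / eps**2) + 1
    tail_bound = 1.0 / (2 * n0)
    # Head: each rounding error is at most 1/D, so the head sum is at most n0/D^2.
    D = math.ceil(math.sqrt(2.0 * n0) / eps) + 1
    z = [Fraction(x_coord(k)).limit_denominator(D) for k in range(1, n0 + 1)]
    head = sum((x_coord(k) - float(c)) ** 2 for k, c in enumerate(z, start=1))
    return z, head + tail_bound

z, bound = approximate_in_K(0.1)
print(len(z), bound < 0.1**2)
```

The returned list plays the part of the element $z$ of $K$; halving `eps`
roughly quadruples the number of non-zero coordinates needed for this
particular $x$.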
60. Lineals in $L_2$. We now introduce some new concepts in
connection with $L_2$.

DEFINITION. A set $U$ of elements of $L_2$ (of functions of $L_2$) is called
a lineal when the following condition is satisfied: if $\varphi(P)$ and
$\psi(P) \in U$, then $c\varphi(P)$, where $c$ is any real number, and
$\varphi(P) + \psi(P)$ also belong to $U$.

It follows from this definition that, if $f_k(P) \in U$ $(k = 1, 2, \ldots, m)$,
then $c_1 f_1(P) + c_2 f_2(P) + \ldots + c_m f_m(P) \in U$ for any choice of numbers
$c_k$. Let us notice some properties of lineals. Let $M$ be a set of functions
bounded on $\mathcal{E}$, i.e. functions such that, given any $f(P) \in M$, there
exists a number $d_f$ such that $|f(P)| \le d_f$ on $\mathcal{E}$. The set $M$ is a lineal
in $L_2$.

We say that $f(P)$ is continuous on a set $\mathcal{E}$ when the following condition
is fulfilled: if points $P$ and $P_n$ $(n = 1, 2, \ldots)$ belong to $\mathcal{E}$ and $P_n \to P$,
then $f(P_n) \to f(P)$ [IV; 157]. The set of functions of $L_2$ continuous on $\mathcal{E}$
is obviously also a lineal.

THEOREM 1. A lineal of functions of $L_2$ continuous on a bounded
set $\mathcal{E}$ is everywhere dense in $L_2$. We have to show that, given any
element $f(P)$ of $L_2$ and any $\delta > 0$, there exists a function $\varphi(P) \in L_2$,
continuous on $\mathcal{E}$, such that

$$\|f - \varphi\|^2 = \int_{\mathcal{E}} [f(P) - \varphi(P)]^2\, G(d\mathcal{E}) < \delta^2, \tag{111}$$

where the two-dimensional case is considered, as above, for the sake
of clarity. We have $f(P) = f^+(P) - f^-(P)$, where $f^+(P)$ and $f^-(P)$ are
the positive and negative parts of $f(P)$. These functions, which belong
to $L_2$ and hence are summable on $\mathcal{E}$, can be assumed to take only
finite values and to be the limit functions of increasing sequences
of piecewise constant functions $\omega_n^+(P)$ and $\omega_n^-(P)$ with a finite number
of values, where $\omega_n^+(P) \le f^+(P)$ and $\omega_n^-(P) \le f^-(P)$ [46]. We have [54]:

$$\lim_{n\to\infty} \int_{\mathcal{E}} [f^+(P) - \omega_n^+(P)]^2\, G(d\mathcal{E}) = 0 \quad\text{and}\quad \lim_{n\to\infty} \int_{\mathcal{E}} [f^-(P) - \omega_n^-(P)]^2\, G(d\mathcal{E}) = 0. \tag{112}$$

Further, it follows from $(x_1 + x_2)^2 \le 2(x_1^2 + x_2^2)$ that

$$[f - (\omega_n^+ - \omega_n^-)]^2 \le 2(f^+ - \omega_n^+)^2 + 2(f^- - \omega_n^-)^2,$$

and, on bringing in the piecewise constant function $\omega_n(P) = \omega_n^+(P) - \omega_n^-(P)$
with a finite number of values, we can fix, in view of (112),
an $n$ such that $\|f - \omega_n\| < \varepsilon_0$, where $\varepsilon_0$ is any given positive number.
On observing that $\|f - \varphi\| \le \|f - \omega_n\| + \|\omega_n - \varphi\|$, we only
need to show that there exists a continuous function $\varphi(P)$ such that
$\|\omega - \varphi\| < \varepsilon$, where $\varepsilon$ is any given positive number and $\omega(P)$ is
a given function with a finite number of values. Such a function can
be written in the form

$$\omega(P) = \sum_{k=1}^{m} c_k\, \omega_{\mathcal{E}_k}(P),$$

where $\omega_{\mathcal{E}_k}(P)$ is the characteristic function of the fixed sets $\mathcal{E}_k$
belonging to $\mathcal{E}$. If $\varphi_k(P)$ $(k = 1, 2, \ldots, m)$ are functions continuous on
$\mathcal{E}$ and $\varphi(P) = c_1\varphi_1(P) + c_2\varphi_2(P) + \ldots + c_m\varphi_m(P)$, then

$$\|\omega - \varphi\| \le \sum_{k=1}^{m} |c_k|\, \|\omega_{\mathcal{E}_k} - \varphi_k\|,$$

and the proof of the theorem reduces to the proof of the following:
given the characteristic function $\omega_{\mathcal{E}_0}(P)$ of any measurable set $\mathcal{E}_0$
belonging to $\mathcal{E}$, and any $\varepsilon > 0$, there exists a function $\varphi(P)$ continuous
on $\mathcal{E}$ such that $\|\omega_{\mathcal{E}_0} - \varphi\| < \varepsilon$. We know that, given any $\varepsilon_0 > 0$,
there exists a closed set $F$ belonging to $\mathcal{E}_0$ such that
$G(\mathcal{E}_0 - F) < \varepsilon_0^2$ [35]. Now:

$$\|\omega_{\mathcal{E}_0} - \omega_F\|^2 = \int_{\mathcal{E}} [\omega_{\mathcal{E}_0}(P) - \omega_F(P)]^2\, G(d\mathcal{E}) = \int_{\mathcal{E}_0 - F} G(d\mathcal{E}) = G(\mathcal{E}_0 - F) < \varepsilon_0^2;$$

by virtue of the inequality $\|\omega_{\mathcal{E}_0} - \varphi\| \le \|\omega_{\mathcal{E}_0} - \omega_F\| + \|\omega_F - \varphi\|$,
it is sufficient to prove our last assertion for the characteristic
function of a bounded closed set $F$ belonging to $\mathcal{E}$. Let $r(P)$ denote
the distance from the point $P$ to the set $F$. We have $r(Q_1) \le r(Q) + |QQ_1|$
and $r(Q) \le r(Q_1) + |QQ_1|$, where $|QQ_1|$ is the distance
between $Q$ and $Q_1$, whence it follows that $r(P)$ is continuous. Further
$r(Q) = 0$ when and only when $Q \in F$ [II; 89]. It is easy to see that
$\omega_F(P)$ is the limit of a non-increasing sequence of functions
continuous on $\mathcal{E}$:

$$\varphi_n(P) = \frac{1}{1 + n\,r(P)}, \tag{113}$$

so that $\|\omega_F - \varphi_n\| \to 0$ [54] as $n \to \infty$, i.e. given any $\varepsilon > 0$, there
exists an $n$ such that $\|\omega_F - \varphi_n\| < \varepsilon$, where $\varphi_n(P)$ is continuous
on $\mathcal{E}$; the theorem is therefore proved.

COROLLARY 1. Let us confine ourselves for clarity to the case of
a plane, and take $\mathcal{E}$ as the closed interval $\Delta\,(a_1 \le x \le b_1;\ a_2 \le y \le b_2)$.
Given any function $\varphi(x, y)$ continuous on $\Delta$, we can form a
polynomial $p(x, y)$ such that $|\varphi(x, y) - p(x, y)| < \varepsilon_0$ on $\Delta$, where $\varepsilon_0$ is
any given positive number. Now:

$$\|\varphi - p\|^2 = \int_{\Delta} [\varphi(x, y) - p(x, y)]^2\, G(d\Delta) \le \varepsilon_0^2\, G(\Delta).$$

Using our theorem and the fact that $\|f - p\| \le \|f - \varphi\| + \|\varphi - p\|$,
where $f(x, y) \in L_2$ on $\Delta$, we can say that a lineal of
polynomials is everywhere dense in $L_2$ on $\Delta$.

Instead of the interval $\Delta$, we could have taken any bounded closed
set $F$, since a function continuous on $F$ can be extended to a closed
interval $\Delta$ containing $F$ whilst preserving continuity [IV; 157].
We have to take into account here that

$$\|\varphi - p\|_F^2 = \int_{F} [\varphi(x, y) - p(x, y)]^2\, G(dF) \le \int_{\Delta} [\varphi(x, y) - p(x, y)]^2\, G(d\Delta).$$

COROLLARY 2. Since the rational numbers are everywhere dense
on the real number axis, given any polynomial $p(x, y)$ and any
$\varepsilon_0 > 0$ there exists a polynomial with rational coefficients $q(x, y)$
such that $|p(x, y) - q(x, y)| < \varepsilon_0$ on $\Delta$ or $F$.

It follows at once from this that the polynomials with rational
coefficients form a set everywhere dense in $L_2$ on $\Delta$ or $F$.

Let us show that the set of such polynomials is denumerable.
We associate with each polynomial $q(x, y)$ a positive number
$\sigma = n + r + s$, where $n$ is the degree of $q(x, y)$, $r$ is the least common
denominator of its coefficients ($r$ is taken as positive), and $s$ is the
sum of the absolute values of the numerators in its coefficients
reduced to the denominator $r$ (exception: if $q(x, y) \equiv 0$, we associate
with it $\sigma = 0$). It is easily seen that the number of polynomials
corresponding to the same $\sigma$ is finite. We can enumerate all the
polynomials with rational coefficients in order of increasing numbers
$\sigma$ corresponding to them, the order of polynomials with the same $\sigma$
being of no consequence. We see that there exists a denumerable
set of elements of $L_2$, everywhere dense in $L_2$, i.e. $L_2$ on $\Delta$ or $F$ is
separable.

Now let $\mathcal{E}$ be any bounded measurable set. We take its closure $\bar{\mathcal{E}}$.
This is a bounded closed set, and, as we have shown, there exists in
$L_2$ on $\bar{\mathcal{E}}$ a denumerable everywhere dense set $\varphi_k(x, y)$ $(k = 1, 2, \ldots)$.
These will be functions of $L_2$ on $\bar{\mathcal{E}}$, and hence on $\mathcal{E}$. Let us show
that the $\varphi_k(x, y)$ are everywhere dense on $\mathcal{E}$. We take some function
$f(x, y)$ of $L_2$ on $\mathcal{E}$ and extend it by zero onto $\bar{\mathcal{E}}$. It will also be of $L_2$
on $\bar{\mathcal{E}}$, and hence, given any $\varepsilon > 0$, a $\varphi_k(x, y)$ of the denumerable
set mentioned can be found such that

$$\int_{\bar{\mathcal{E}}} [f(x, y) - \varphi_k(x, y)]^2\, G(d\mathcal{E}) = \int_{\mathcal{E}} [f(x, y) - \varphi_k(x, y)]^2\, G(d\mathcal{E}) + \int_{\bar{\mathcal{E}} - \mathcal{E}} [f(x, y) - \varphi_k(x, y)]^2\, G(d\mathcal{E}) < \varepsilon$$

and all the more,

$$\int_{\mathcal{E}} [f(x, y) - \varphi_k(x, y)]^2\, G(d\mathcal{E}) < \varepsilon,$$

and so on. The separability of $L_2$ on any measurable set will be proved
below.

We shall prove a further theorem, which will be useful later.

THEOREM 2. If the closure equation holds for every function of a
set $K$, dense in $L_2$, it also holds for any function of $L_2$.

Let $f(P)$ be an element of $L_2$ and $\varepsilon$ a given positive number. Since
$K$ is dense in $L_2$, there is an element $\varphi(P)$ in $K$ such that $\|f - \varphi\| \le \varepsilon/3$.
By hypothesis, the closure equation holds for $\varphi(P)$, so that we can
take a segment $s_n(\varphi)$ of the Fourier series of $\varphi(P)$ with respect to the
orthonormal system of functions $\varphi_k(P)$ such that $\|\varphi - s_n(\varphi)\| \le \varepsilon/3$.

On taking into account the equation
$f - s_n(f) = (f - \varphi) + (\varphi - s_n(\varphi)) + (s_n(\varphi) - s_n(f))$
and the triangle rule, we get
$\|f - s_n(f)\| \le \|f - \varphi\| + \|\varphi - s_n(\varphi)\| + \|s_n(\varphi) - s_n(f)\|$,
whence $\|f - s_n(f)\| \le 2\varepsilon/3 + \|s_n(\varphi) - s_n(f)\|$. But the difference
$s_n(\varphi) - s_n(f)$ between the segments of the Fourier series for $\varphi$ and $f$
is the segment of the Fourier series for $\varphi - f$, i.e.
$s_n(\varphi) - s_n(f) = s_n(\varphi - f)$, and, by Bessel's inequality,
$\|s_n(\varphi) - s_n(f)\| \le \|\varphi - f\| \le \varepsilon/3$. Finally,
$\|f - s_n(f)\| \le 2\varepsilon/3 + \varepsilon/3 = \varepsilon$, whence it follows, since $\varepsilon$ is arbitrary,
that the closure equation holds for $f(P)$, and the theorem is proved.

Everything said above can be generalized at once to the case of
complex functions of $L_2$ [58].


61. Examples of closed systems. We shall give some simple examples
of orthonormal systems, closed in the finite interval $[a, b]$.
If we apply the orthogonalization process to non-negative integral
powers of $x$: $1, x, x^2, \ldots$ [IV; 38], we get a system of orthogonal
polynomials $p_k(x)$ $(k = 0, 1, 2, \ldots)$ on the interval $[a, b]$, where
$p_k(x)$ is of degree $k$. Every polynomial $p(x)$ of degree $n$ can be written
as a linear combination

$$p(x) = \sum_{k=0}^{n} c_k\, p_k(x). \tag{114}$$

To see this, it is sufficient to define $c_n$ in such a way that the
coefficient of $x^n$ on the right-hand side is the same as in $p(x)$. We then
have to define $c_{n-1}$ so that the coefficient of $x^{n-1}$ in the term
$c_{n-1}\,p_{n-1}(x)$ is the same as in $p(x) - c_n p_n(x)$, and so on. The
coefficients $c_k$ in (114) are obviously equal to the Fourier coefficients
of $p(x)$ with respect to the $p_k(x)$. It follows from the exact equation
(114) that, in the case of an orthogonal system $p_k(x)$, the closure
equation holds for any polynomial $p(x)$. Since the polynomials are
everywhere dense in $L_2$ on $[a, b]$ (Corollary 1 of [60]), it follows by
Theorem 2 of the previous section that the system of orthogonal
polynomials is closed. We have seen above that, on the interval
$[-l, +l]$, with the orthogonal system

$$\sin\frac{n\pi x}{l}; \quad \cos\frac{n\pi x}{l} \quad (n = 0, 1, 2, \ldots) \tag{115}$$

the closure equation is fulfilled for any continuous function [II; 148],
whence it follows that system (115) is closed in $L_2$. Similarly, the
orthogonal systems of functions

$$\sin\frac{n\pi x}{l} \quad (n = 1, 2, \ldots) \qquad\text{and}\qquad \cos\frac{n\pi x}{l} \quad (n = 0, 1, 2, \ldots)$$

are closed on the interval $[0, l]$.
We saw earlier [IV; 99] that, in the case of the eigenfunctions
$\varphi_k(x)$ $(k = 1, 2, \ldots)$ of a boundary value problem, every function with
continuous derivatives up to the second order and satisfying the boundary
conditions can be expanded in a uniformly convergent Fourier series in
functions $\varphi_k(x)$. The closure equation will hold all the more for such
functions. By varying the value of the function in narrow intervals
close to the ends of the interval, it may easily be seen that the
closure equations are observed for all functions with continuous
derivatives up to the second order, without requiring that
the boundary conditions be satisfied at the ends. The closure equation
will be satisfied all the more for all polynomials, so that
the system of eigenfunctions $\varphi_k(x)$ is closed.

[Fig. 3. Axes $x$ and $y$, with the line $x = a$ marked.]
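
The orthogonalization process applied to $1, x, x^2, \ldots$, and the
statement that the coefficients in (114) are the Fourier coefficients, can
be checked in exact arithmetic. The sketch below is illustrative only: the
helper names, the choice of the interval $[-1, 1]$ and the representation
of a polynomial as a list of coefficients (index equal to the power of $x$)
are assumptions made here, and the polynomials are left unnormalized.

```python
from fractions import Fraction

def inner(p, q, a, b):
    """Exact integral of p(x) * q(x) over [a, b]; p and q are coefficient lists."""
    total = Fraction(0)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            n = i + j + 1
            total += pi * qj * (Fraction(b) ** n - Fraction(a) ** n) / n
    return total

def orthogonal_polynomials(count, a, b):
    """Orthogonalize 1, x, x^2, ... on [a, b]; the k-th result has degree k."""
    basis = []
    for k in range(count):
        p = [Fraction(0)] * k + [Fraction(1)]          # the monomial x^k
        for q in basis:
            c = inner(p, q, a, b) / inner(q, q, a, b)
            # subtract the projection of p on the earlier polynomial q
            p = [pc - c * (q[i] if i < len(q) else 0) for i, pc in enumerate(p)]
        basis.append(p)
    return basis

def expand(p, basis, a, b):
    """Fourier coefficients c_k = (p, p_k) / (p_k, p_k) of p, as in (114)."""
    return [inner(p, q, a, b) / inner(q, q, a, b) for q in basis]

ps = orthogonal_polynomials(4, -1, 1)
print(ps[2], ps[3])                      # x^2 - 1/3 and x^3 - (3/5) x
print(expand([1, 2, 3], ps[:3], -1, 1))  # coefficients of 1 + 2x + 3x^2
```

On $[-1, 1]$ the polynomials obtained are constant multiples of the Legendre
polynomials; the expansion of $1 + 2x + 3x^2$ gives $2 p_0 + 2 p_1 + 3 p_2$,
which reproduces the polynomial exactly.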
62. The Hölder and Minkovskii inequalities. In addition to the
class $L_2$, a class often discussed is $L_p$, of measurable functions $f(P)$
for which the $p$th power of the absolute value (or modulus for a
complex function), i.e. $|f(P)|^p$, is summable on $\mathcal{E}$ [cf. 55]. We shall
first deduce inequalities for sums and integrals analogous to (67) and
(69), with any index $p$ greater than unity.

Let $\alpha$ be a positive number. We take the curve $y = x^{\alpha}$ on the $XY$
plane and draw the straight lines $x = a$ and $y = b$, parallel to the
axes (Fig. 3). These straight lines, the axes and the curve bound
two plane domains, having the areas:

$$\int_0^a x^{\alpha}\, dx = \frac{a^{1+\alpha}}{1+\alpha}; \qquad \int_0^b y^{1/\alpha}\, dy = \frac{b^{1+1/\alpha}}{1 + 1/\alpha}.$$

As is directly obvious from the figure, the sum of these areas is
not less than the area $ab$ of the rectangle with sides $a$ and $b$, i.e.

$$ab \le \frac{a^{1+\alpha}}{1+\alpha} + \frac{b^{1+1/\alpha}}{1 + 1/\alpha}.$$

On writing $p = 1 + \alpha$ and $p' = 1 + 1/\alpha$, we can rewrite the
inequality as

$$ab \le \frac{a^p}{p} + \frac{b^{p'}}{p'}, \tag{116}$$

where $p$ and $p'$ are obviously connected by the relationship

$$\frac{1}{p} + \frac{1}{p'} = 1. \tag{117}$$

In view of the arbitrariness of the positive number $\alpha$, inequality
(116) holds for any positive $p$ and $p'$ connected by (117). Both these
numbers must evidently be greater than unity. If $p = 2$, then $p' = 2$,
and (116) reduces to the obvious inequality: $2ab \le a^2 + b^2$. It follows
from Fig. 3 that the $=$ sign in (116) holds when and only when the
point of intersection of $x = a$ and $y = b$ lies on $y = x^{\alpha}$, i.e. when
$b = a^{\alpha}$. Suppose further that the positive numbers $a_k'$ and $b_k'$
$(k = 1, 2, \ldots, n)$ satisfy the relationships

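$$\sum_{k=1}^{n} (a_k')^p = 1 \quad\text{and}\quad \sum_{k=1}^{n} (b_k')^{p'} = 1. \tag{118}$$

We put $a = a_k'$ and $b = b_k'$ in (116). On summing over $k$ and taking
(117) and (118) into account, we get

$$\sum_{k=1}^{n} a_k' b_k' \le 1. \tag{119}$$

We now take any positive numbers $a_k$ and $b_k$, and write

$$A = \Bigl(\sum_{k=1}^{n} a_k^p\Bigr)^{1/p}; \qquad B = \Bigl(\sum_{k=1}^{n} b_k^{p'}\Bigr)^{1/p'}. \tag{120}$$

The numbers $a_k' = a_k/A$ and $b_k' = b_k/B$ obviously satisfy (118),
and we thus have inequality (119) for them, whilst (119) can here
be written as

$$\sum_{k=1}^{n} a_k b_k \le AB,$$

i.e.

$$\sum_{k=1}^{n} a_k b_k \le \Bigl(\sum_{k=1}^{n} a_k^p\Bigr)^{1/p} \Bigl(\sum_{k=1}^{n} b_k^{p'}\Bigr)^{1/p'}. \tag{121}$$

On passing to the limit, we get a similar inequality for the infinite
sums

$$\sum_{k=1}^{\infty} a_k b_k \le \Bigl(\sum_{k=1}^{\infty} a_k^p\Bigr)^{1/p} \Bigl(\sum_{k=1}^{\infty} b_k^{p'}\Bigr)^{1/p'}, \tag{122}$$

on the assumption that the series on the right are convergent. In this
case, the series on the left must be convergent, by virtue of the
inequality. Certain of the $a_k$ and $b_k$ may be zero. In the case of complex
numbers, we can use the obvious inequality

$$\Bigl|\sum_k a_k b_k\Bigr| \le \sum_k |a_k|\,|b_k|$$

to write (122) as

$$\Bigl|\sum_k a_k b_k\Bigr| \le \Bigl(\sum_k |a_k|^p\Bigr)^{1/p} \Bigl(\sum_k |b_k|^{p'}\Bigr)^{1/p'}. \tag{123}$$

These inequalities are usually known as the Hölder inequalities
for sums. When $p = p' = 2$, they degenerate to the ordinary inequality
(106) of [59]. Similar inequalities hold for integrals. Suppose that
$f(P) \in L_p$ and $g(P) \in L_{p'}$. We have by (116):

$$|f(P)\,g(P)| \le \frac{|f(P)|^p}{p} + \frac{|g(P)|^{p'}}{p'}.$$

The right-hand side is summable by hypothesis, so that $f(P)\,g(P)$ is
summable, i.e. if $f(P) \in L_p$ and $g(P) \in L_{p'}$, $f(P)\,g(P)$ is summable
(cf. Theorem 1 of [55]). The integral of $f(P)\,g(P)$ satisfies a Hölder
inequality analogous to (67) of [55]:

$$\Bigl|\int_{\mathcal{E}} f g\, dx\, dy\Bigr| \le \Bigl(\int_{\mathcal{E}} |f|^p\, dx\, dy\Bigr)^{1/p} \Bigl(\int_{\mathcal{E}} |g|^{p'}\, dx\, dy\Bigr)^{1/p'}, \tag{124}$$

this being written only for Lebesgue integrals (it is also true for
Lebesgue-Stieltjes integrals). It is obtained in the usual way from
(122) with the aid of a passage to the limit. Let $\delta_n'$ and $\delta_n''$ be
indefinitely diminishing sequences of Lebesgue subdivisions for $|f|$ and
$|g|$ and $\delta_n = \delta_n'\delta_n''$ be the product of these subdivisions. Let $\mathcal{E}_k^{(n)}$ be
the component parts of the set $\mathcal{E}$ in the subdivision $\delta_n$, and $m_{k,n}'$
and $m_{k,n}''$ the strict upper bounds of $|f|$ and $|g|$ on $\mathcal{E}_k^{(n)}$. On taking
(117) into account, we can write

$$\sum_k m_{k,n}'\, m_{k,n}''\, m(\mathcal{E}_k^{(n)}) = \sum_k m_{k,n}'\, m^{1/p}(\mathcal{E}_k^{(n)}) \cdot m_{k,n}''\, m^{1/p'}(\mathcal{E}_k^{(n)}).$$

We now apply Hölder's inequality with $a_k = m_{k,n}'\, m^{1/p}(\mathcal{E}_k^{(n)})$ and
$b_k = m_{k,n}''\, m^{1/p'}(\mathcal{E}_k^{(n)})$:

$$\sum_k m_{k,n}'\, m_{k,n}''\, m(\mathcal{E}_k^{(n)}) \le \Bigl(\sum_k (m_{k,n}')^p\, m(\mathcal{E}_k^{(n)})\Bigr)^{1/p} \Bigl(\sum_k (m_{k,n}'')^{p'}\, m(\mathcal{E}_k^{(n)})\Bigr)^{1/p'}. \tag{125}$$

We write $m_{k,n}$ for the strict upper bound of $|f|\,|g|$ on $\mathcal{E}_k^{(n)}$.
We obviously have $m_{k,n} \le m_{k,n}'\, m_{k,n}''$, and it follows from (125) that

$$\sum_k m_{k,n}\, m(\mathcal{E}_k^{(n)}) \le \Bigl(\sum_k (m_{k,n}')^p\, m(\mathcal{E}_k^{(n)})\Bigr)^{1/p} \Bigl(\sum_k (m_{k,n}'')^{p'}\, m(\mathcal{E}_k^{(n)})\Bigr)^{1/p'}.$$

On passing to the limit for the sequence of Lebesgue subdivisions,
we get

$$\int_{\mathcal{E}} |f|\,|g|\, dx\, dy \le \Bigl(\int_{\mathcal{E}} |f|^p\, dx\, dy\Bigr)^{1/p} \Bigl(\int_{\mathcal{E}} |g|^{p'}\, dx\, dy\Bigr)^{1/p'}, \tag{126}$$

whence (124) follows.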
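
Inequalities (116) and (121) are easy to test on numbers. The following
is a minimal sketch, with function names invented here for illustration:
each function returns the right-hand side minus the left-hand side, so a
non-negative value means that the inequality is satisfied. The last line
takes $b = a^{p-1}$, the case in which the $=$ sign holds in (116).

```python
import random

def conjugate(p):
    # p' defined by 1/p + 1/p' = 1, i.e. relationship (117); requires p > 1
    return p / (p - 1.0)

def young_gap(a, b, p):
    # right-hand side minus left-hand side of (116)
    q = conjugate(p)
    return a**p / p + b**q / q - a * b

def holder_gap(a, b, p):
    # right-hand side minus left-hand side of (121) for two finite lists
    q = conjugate(p)
    lhs = sum(x * y for x, y in zip(a, b))
    rhs = sum(x**p for x in a) ** (1 / p) * sum(y**q for y in b) ** (1 / q)
    return rhs - lhs

random.seed(0)
a = [random.uniform(0.01, 3.0) for _ in range(6)]
b = [random.uniform(0.01, 3.0) for _ in range(6)]
print(young_gap(a[0], b[0], 3.0) >= 0, holder_gap(a, b, 3.0) >= 0)
print(abs(young_gap(2.0, 2.0 ** 2, 3.0)) < 1e-12)   # equality case b = a^(p-1)
```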

Let us prove a further inequality, analogous to (69) of [55]. We first
take the case of a sum. As above, let $a_k$ and $b_k$ be sequences of positive
numbers. On summing the obvious equation

$$(a_k + b_k)^p = (a_k + b_k)^{p-1} a_k + (a_k + b_k)^{p-1} b_k,$$

we get

$$\sum_k (a_k + b_k)^p = \sum_k (a_k + b_k)^{p-1} a_k + \sum_k (a_k + b_k)^{p-1} b_k.$$

On applying Hölder's inequality to the sums on the right, we arrive
at the inequality

$$\sum_k (a_k + b_k)^p \le \Bigl(\sum_k a_k^p\Bigr)^{1/p} \Bigl(\sum_k (a_k + b_k)^{(p-1)p'}\Bigr)^{1/p'} + \Bigl(\sum_k b_k^p\Bigr)^{1/p} \Bigl(\sum_k (a_k + b_k)^{(p-1)p'}\Bigr)^{1/p'}.$$

But, by (117), $p' = p/(p - 1)$, and the last inequality can be
rewritten as

$$\sum_k (a_k + b_k)^p \le \Bigl(\sum_k (a_k + b_k)^p\Bigr)^{1/p'} \Bigl[\Bigl(\sum_k a_k^p\Bigr)^{1/p} + \Bigl(\sum_k b_k^p\Bigr)^{1/p}\Bigr].$$

On dividing both sides by the factor in front of the square bracket,
we arrive at Minkovskii's inequality for a sum:

$$\Bigl(\sum_k (a_k + b_k)^p\Bigr)^{1/p} \le \Bigl(\sum_k a_k^p\Bigr)^{1/p} + \Bigl(\sum_k b_k^p\Bigr)^{1/p}. \tag{127}$$

This inequality leads, precisely as above, to the Minkovskii integral
inequality for $f(P)$ and $g(P) \in L_p$:

$$\Bigl(\int_{\mathcal{E}} |f + g|^p\, dx\, dy\Bigr)^{1/p} \le \Bigl(\int_{\mathcal{E}} |f|^p\, dx\, dy\Bigr)^{1/p} + \Bigl(\int_{\mathcal{E}} |g|^p\, dx\, dy\Bigr)^{1/p}; \tag{128}$$

we have to notice here that $|f + g| \le |f| + |g|$. Inequalities (127)
and (128) have been deduced on the assumption that $p > 1$. They
are obvious for $p = 1$, but cease to be valid for $p < 1$.
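
The failure of (127) for $p < 1$ can be seen on a pair of two-term
sequences. This is a small sketch with names chosen here for illustration;
the function returns the right-hand side of (127) minus the left-hand
side, so a negative value means that the inequality is violated.

```python
def p_norm(v, p):
    # (sum |v_k|^p)^(1/p); satisfies the triangle rule only for p >= 1
    return sum(abs(x) ** p for x in v) ** (1.0 / p)

def minkowski_gap(a, b, p):
    # right-hand side minus left-hand side of (127)
    s = [x + y for x, y in zip(a, b)]
    return p_norm(a, p) + p_norm(b, p) - p_norm(s, p)

a, b = [1.0, 0.0], [0.0, 1.0]
print(minkowski_gap(a, b, 2.0))   # positive: (127) holds for p = 2
print(minkowski_gap(a, b, 1.0))   # zero: equality for p = 1
print(minkowski_gap(a, b, 0.5))   # negative: (127) fails for p = 1/2
```

For $p = 1/2$ the left-hand side of (127) is $(1 + 1)^2 = 4$, whereas the
right-hand side is $1 + 1 = 2$.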

By using the above inequalities, we can easily prove for the function
space $L_p$ $(p > 1)$ the properties that we had earlier for $L_2$,
the functions here being assumed complex. Let us recapitulate these
properties, in the order of [55]. If $f(P) \in L_p$ and $g(P) \in L_{p'}$ $(p > 1)$,
$f(P)$ and $f(P)\,g(P)$ are summable on $\mathcal{E}$. This follows from (124).
If $f(P)$ and $g(P) \in L_p$ and $c$ is a constant, then $cf(P)$ and
$f(P) + g(P) \in L_p$ $(p > 1)$. This follows from (128). A sequence of functions
$f_n(P)$ of $L_p$ is said to be convergent in the mean in $L_p$ $(p > 1)$ or
convergent in the mean with index $p$ to the function $f(P)$ of $L_p$ if

$$\lim_{n\to\infty} \int_{\mathcal{E}} |f(P) - f_n(P)|^p\, G(d\mathcal{E}) = 0.$$

The limit in the mean in $L_p$ is unique up to equivalent functions.
If $f_n(P) \to f(P)$ in the mean, a subsequence $f_{n_k}(P)$ can be
extracted from the sequence $f_n(P)$ such that it is convergent almost
everywhere on $\mathcal{E}$ to $f(P)$. Mutual convergence is defined by a
condition analogous to (72):

$$\int_{\mathcal{E}} |f_n(P) - f_m(P)|^p\, G(d\mathcal{E}) < \varepsilon$$

for $n$ and $m > N$, and the necessary and sufficient condition that
the sequence $f_n(P)$ be convergent in the mean to a function of $L_p$
$(p > 1)$ is that it be mutually convergent in $L_p$.

If $f_n(P) \to f(P)$ in $L_p$ and $g_n(P) \to g(P)$ in $L_{p'}$ $(p > 1)$, then

$$\lim_{n\to\infty} \int_{\mathcal{E}} f_n(P)\, g_n(P)\, G(d\mathcal{E}) = \int_{\mathcal{E}} f(P)\, g(P)\, G(d\mathcal{E}).$$

We can also introduce the norm in $L_p$ $(p > 1)$:

$$\|f\| = \Bigl(\int_{\mathcal{E}} |f(P)|^p\, G(d\mathcal{E})\Bigr)^{1/p},$$

and the distance between two elements $\rho(f, g) = \|f - g\|$, where we
have $\|cf(P)\| = |c|\,\|f(P)\|$ and the triangle rule.
Let us show further that, if $q > p$ and $f(P) \in L_q$, then $f(P) \in L_p$.
By hypothesis,

$$\int_{\mathcal{E}} |f(P)|^q\, G(d\mathcal{E}) = A < +\infty.$$

We consider the integral:

$$\int_{\mathcal{E}} |f(P)|^p\, G(d\mathcal{E}) = \int_{\mathcal{E}(|f(P)| \le 1)} |f(P)|^p\, G(d\mathcal{E}) + \int_{\mathcal{E}(|f(P)| > 1)} |f(P)|^p\, G(d\mathcal{E}) \le \int_{\mathcal{E}} G(d\mathcal{E}) + \int_{\mathcal{E}} |f(P)|^q\, G(d\mathcal{E}) = G(\mathcal{E}) + A,$$

whence it follows that $f(P) \in L_p$. In the proof, we have used the fact
that the measure $G(\mathcal{E})$ of the set $\mathcal{E}$ is finite.

But in $L_p$ (with $p \ne 2$), we do not have the scalar product that
we had in $L_2$.

A space $l_p$ can be formed in the same way as $l_2$, in which the elements
are infinite sequences of complex numbers $(x_1, x_2, \ldots)$ such that the
series formed from $|x_k|^p$ is convergent. It has properties analogous
to those of $l_2$ when $p > 1$, the resemblance being the same as that
of $L_p$ to $L_2$. There is no scalar product in $l_p$ $(p \ne 2)$, and the connection
with $L_p$, such as we established between $L_2$ and $l_2$, is missing. Inequalities
(106) and (107) are replaced by (122) and (127), in which $a_k = |x_k|$
and $b_k = |y_k|$.
63. Integral over a set of infinite measure. We have so far
considered the integral over a measurable set $\mathcal{E}$ of finite measure. The
integral can be extended to the case of a set of infinite measure in
much the same way as the Riemann integral was extended to the
case of an infinite interval. Let $f(P)$ be a measurable and non-negative
function given on a measurable set $\mathcal{E}$ of infinite measure. We take
an increasing infinite sequence of sets of finite measure

$$\mathcal{E}_1 \subset \mathcal{E}_2 \subset \mathcal{E}_3 \subset \ldots, \tag{129}$$

for which $\mathcal{E}$ is the limiting set. The sets $\mathcal{E}_n$ can be formed say from
the products of the set $\mathcal{E}$ with the intervals $\Delta_n$ $(-n \le x \le +n;\ -n \le y \le +n)$.
The integrals

$$\int_{\mathcal{E}_n} f(P)\, G(d\mathcal{E}) \tag{130}$$

exist for the sets $\mathcal{E}_n$, and do not decrease as $n$ increases because
$f(P)$ is non-negative. The limit of the monotonic sequence (130) is
defined as the integral of $f(P)$ over $\mathcal{E}$:

$$\int_{\mathcal{E}} f(P)\, G(d\mathcal{E}) = \lim_{n\to\infty} \int_{\mathcal{E}_n} f(P)\, G(d\mathcal{E}). \tag{131}$$

Notice that integrals (130) may be equal to $(+\infty)$. In this case
the integral of $f(P)$ over $\mathcal{E}$ is obviously also $(+\infty)$. It may happen
that all the integrals (130) are finite, whilst the integral over $\mathcal{E}$ is
$(+\infty)$. To justify the above definition of the integral, we have to
show that the limit of the numerical sequence (130) does not depend
on the choice of monotonic increasing sequence of sets $\mathcal{E}_n$.

THEOREM. Integrals (130) have the same limit whatever the choice
of the increasing sequence of measurable sets $\mathcal{E}_n$ of finite measure tending
to $\mathcal{E}$.

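We use reductio ad absurdum. Let $\mathcal{E}_1' \subset \mathcal{E}_2' \subset \mathcal{E}_3' \subset \ldots$ be another
increasing sequence of sets of finite measure having $\mathcal{E}$ as the limiting
set and such that the sequences of integrals (130) have different
limits for sets $\mathcal{E}_n$ and $\mathcal{E}_n'$:

$$\lim_{n\to\infty} \int_{\mathcal{E}_n} f(P)\, G(d\mathcal{E}) = a \quad\text{and}\quad \lim_{n\to\infty} \int_{\mathcal{E}_n'} f(P)\, G(d\mathcal{E}) = b > a. \tag{132}$$

The number $a$ is always finite, and we have

$$\int_{\mathcal{E}_n} f(P)\, G(d\mathcal{E}) \le a. \tag{133}$$

Suppose first that the number $b$ is finite.
Having chosen the positive number $c < b - a$, we can fix a value
of the positive integer $m$ such that

$$\int_{\mathcal{E}_m'} f(P)\, G(d\mathcal{E}) > a + c. \tag{134}$$

Since $f(P)$ is non-negative and $\mathcal{E}_m'\mathcal{E}_n$ is part of $\mathcal{E}_n$, we have by (133)

$$\int_{\mathcal{E}_m'\mathcal{E}_n} f(P)\, G(d\mathcal{E}) \le a. \tag{135}$$

We consider the sets $\mathcal{E}_m'\mathcal{E}_n$. They increase as $n$ increases, and since
$\mathcal{E}$ is the limiting set for $\mathcal{E}_n$, the limiting set for $\mathcal{E}_m'\mathcal{E}_n$ will be
$\mathcal{E}_m'$, whence it follows that

$$\lim_{n\to\infty} G(\mathcal{E}_m' - \mathcal{E}_m'\mathcal{E}_n) = 0. \tag{136}$$

Since $b$ is finite, $f(P)$ is summable on $\mathcal{E}_m'$, and in view of (136)
and the absolute continuity of the integral of $f(P)$, we have

$$\lim_{n\to\infty} \int_{\mathcal{E}_m'\mathcal{E}_n} f(P)\, G(d\mathcal{E}) = \int_{\mathcal{E}_m'} f(P)\, G(d\mathcal{E}),$$

which contradicts inequalities (134) and (135). If $b = +\infty$ we take
$[f(P)]_N$ instead of $f(P)$, choosing $N$ and $m$ so large that

$$\int_{\mathcal{E}_m'} [f(P)]_N\, G(d\mathcal{E}) > a + 1.$$

By (133), we also have

$$\int_{\mathcal{E}_m'\mathcal{E}_n} [f(P)]_N\, G(d\mathcal{E}) \le a.$$

The previous argument leads us to a contradiction, and the theorem
is proved.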
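
The independence of the limit from the choice of increasing sequence can
be observed on a simple example. The sketch below is an illustration only:
the function $f(x) = 1/(1 + x^2)$ on the straight line, the two sequences of
intervals and all names are chosen here and are not taken from the text.
The integral over an interval is evaluated in closed form by means of the
arc tangent.

```python
import math

def integral(a, b):
    # exact value of the integral of f(x) = 1 / (1 + x^2) over [a, b]
    return math.atan(b) - math.atan(a)

def symmetric(n):
    # E_n = [-n, n]
    return integral(-n, n)

def lopsided(n):
    # E'_n = [-sqrt(n), n^2]: another increasing sequence tending to the whole line
    return integral(-math.sqrt(n), n * n)

for n in (1, 10, 100, 10**6):
    print(n, symmetric(n), lopsided(n))
print(math.pi)
```

Both sequences of integrals increase with $n$ and approach the same value
$\pi$, although at different rates.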

If the integral of a non-negative function $f(P)$ over $\mathcal{E}$ has a finite
value, we say that $f(P)$ is summable on $\mathcal{E}$. It follows from this and
the above definition that, if $f(P)$ is summable, and the non-negative
function $\varphi(P)$ satisfies $\varphi(P) \le f(P)$ on $\mathcal{E}$, $\varphi(P)$ must also be summable.
We now take a function measurable on $\mathcal{E}$ that can vary in sign, and
split it into positive and negative parts:

$$f(P) = f^+(P) - f^-(P). \tag{137}$$

The function $f(P)$ is said to be summable on $\mathcal{E}$ if $f^+(P)$ and $f^-(P)$
are summable. The value of the integral is now given by

$$\int_{\mathcal{E}} f(P)\, G(d\mathcal{E}) = \int_{\mathcal{E}} f^+(P)\, G(d\mathcal{E}) - \int_{\mathcal{E}} f^-(P)\, G(d\mathcal{E}). \tag{138}$$

If only one of the functions $f^+(P)$ and $f^-(P)$ is summable, the
integral of $f(P)$ still has a meaning, as in [52], but its value will be
$(+\infty)$ or $(-\infty)$. For the most part, the set on which the integration
is performed is the whole of the plane or the whole of a straight line
or in general the whole of $n$-dimensional space.

The theorem of [52] and properties 1, 2, 3, 4, 5, 6, 7, 9 and 10
hold for integrals on a measurable set of infinite measure. We shall
only prove the complete additivity and absolute continuity. The
proof of the theorem and of the remaining properties is extremely
simple. As a preliminary we must prove a simple lemma.

LEMMA. If the non-negative numbers $a_k^{(s)}$ do not decrease as $s$ increases
and $\lim_{s\to\infty} a_k^{(s)} = a_k$, on writing

$$a^{(s)} = \sum_{k=1}^{\infty} a_k^{(s)},$$

we have

$$\lim_{s\to\infty} a^{(s)} = \sum_{k=1}^{\infty} a_k. \tag{139}$$

We use reductio ad absurdum, allowing that the sums written may
have the value $(+\infty)$. Writing $a$ for the limit of $a^{(s)}$, we suppose
first that

$$a > \sum_{k=1}^{\infty} a_k.$$

For sufficiently large $s$, we have $a^{(s)} > c$, where $c$ is the sum of
series (139), and, having fixed such an $s$, we can fix so large an $m$
that

$$\sum_{k=1}^{m} a_k^{(s)} > \sum_{k=1}^{\infty} a_k,$$

so that all the more:

$$\sum_{k=1}^{m} a_k^{(s)} > \sum_{k=1}^{m} a_k,$$

which is impossible, since $a_k^{(s)} \le a_k$ for every $k$ and $s$. Suppose
now that

$$a < \sum_{k=1}^{\infty} a_k.$$

We therefore have, for some fixed $m$:

$$\sum_{k=1}^{m} a_k > a.$$

We can now choose so large an $s$ that

$$\sum_{k=1}^{m} a_k^{(s)} > a.$$

This finite sum is obviously $\le a^{(s)}$, so that $a^{(s)} > a$, which is
absurd, since the sequence $a^{(s)}$ tends to $a$ without decreasing. The
lemma is proved.
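
The part played by the monotonicity condition in the lemma can be seen by
comparing two families of numbers. Both families in the sketch below are
hypothetical examples chosen here, not taken from the text: in the first,
$a_k^{(s)}$ does not decrease with $s$ and (139) holds; in the second, a unit
term moves off to infinity, each $a_k^{(s)}$ tends to zero, yet every $a^{(s)}$
is equal to 1.

```python
from fractions import Fraction

def monotone_term(k, s):
    # a_k^(s) = 2^-k for k <= s and 0 otherwise: non-decreasing in s, limit 2^-k
    return Fraction(1, 2**k) if k <= s else Fraction(0)

def escaping_term(k, s):
    # a_k^(s) = 1 for k == s and 0 otherwise: not monotone in s, limit 0
    return Fraction(1) if k == s else Fraction(0)

def row_sum(term, s, terms=200):
    # a^(s) = sum over k of a_k^(s); both families vanish for k > s,
    # so the finite range of k is exact as long as s <= terms
    return sum(term(k, s) for k in range(1, terms + 1))

print([row_sum(monotone_term, s) for s in (1, 2, 10)])   # 1 - 2^-s, tending to 1
print([row_sum(escaping_term, s) for s in (1, 2, 10)])   # always 1, though sum of limits is 0
```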

We now show that the integral is completely additive. Let $f(P)$
be summable on $\mathcal{E}$ and let us divide this set into a finite or denumerable
number of measurable sets $\mathcal{E}_k$ of finite or infinite measure. Now,
$f(P)$ will be summable on each $\mathcal{E}_k$. Suppose further that
$\mathcal{E}^{(1)} \subset \mathcal{E}^{(2)} \subset \ldots$ is an increasing sequence of sets of finite measure
tending to $\mathcal{E}$. We introduce the sets $\mathcal{E}_k^{(s)} = \mathcal{E}_k \mathcal{E}^{(s)}$ of finite measure. They
increase as $s$ increases, $\lim_{s\to\infty} \mathcal{E}_k^{(s)} = \mathcal{E}_k$, and
$\mathcal{E}^{(s)} = \mathcal{E}_1^{(s)} + \mathcal{E}_2^{(s)} + \ldots$, the sets on the right-hand side having no
points in common with each other. We have for the sets $\mathcal{E}^{(s)}$ of finite
measure:

$$\int_{\mathcal{E}^{(s)}} f(P)\, G(d\mathcal{E}) = \sum_{k=1}^{\infty} \int_{\mathcal{E}_k^{(s)}} f(P)\, G(d\mathcal{E}).$$

On assuming $f(P)$ non-negative for the moment, passing to the limit
in this formula as $s \to \infty$, and using the lemma, we obtain (20)
of [49]. Our assertion holds in the general case on the basis of (137)
and the fact that it holds separately for $f^+(P)$ and $f^-(P)$.

Property 6 of [49] is proved similarly. Let us show that the integral
is absolutely continuous. We take $f(P) \ge 0$ and summable on $\mathcal{E}$.
Given $\varepsilon > 0$, we choose $m$ so large that

$$\int_{\mathcal{E} - \mathcal{E}^{(m)}} f\, G(d\mathcal{E}) \le \frac{\varepsilon}{2}. \tag{140}$$

We can write for any set $e$ contained in $\mathcal{E}$:

$$\int_{e} f\, G(d\mathcal{E}) = \int_{\mathcal{E}^{(m)} e} f\, G(d\mathcal{E}) + \int_{(\mathcal{E} - \mathcal{E}^{(m)})e} f\, G(d\mathcal{E}).$$

In view of the absolute continuity of the integral on the set $\mathcal{E}^{(m)}$
of finite measure, there exists an $\eta > 0$ such that the absolute value
of the integral over $e$ is not greater than $\varepsilon/2$ for $e \subset \mathcal{E}^{(m)}$ and
$G(e) \le \eta$. On taking (140) into account, we can assert that the same
is true for the integral over $(\mathcal{E} - \mathcal{E}^{(m)})e$, whence it follows that the
absolute value of the integral over $e$ is not greater than $\varepsilon$ if $e \subset \mathcal{E}$
and $G(e) \le \eta$, which proves that the integral is absolutely continuous.
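Theorems 1, 2, 3, 4 of [54] are also easily extended to the case
of a set $\mathcal{E}$ of infinite measure. We shall prove Theorem 1 as an example.
Let $\varepsilon$ be a given positive number and let $F(P)$ be the summable function
of that theorem, for which $|f_n(P)| \le F(P)$. We choose so large an $m$ that

$$\int_{\mathcal{E} - \mathcal{E}^{(m)}} F(P)\, G(d\mathcal{E}) \le \varepsilon. \tag{141}$$

We now write an inequality for the integral of $f(P) - f_n(P)$:

$$\Bigl|\int_{\mathcal{E}} (f - f_n)\, G(d\mathcal{E})\Bigr| \le \Bigl|\int_{\mathcal{E}^{(m)}} (f - f_n)\, G(d\mathcal{E})\Bigr| + \Bigl|\int_{\mathcal{E} - \mathcal{E}^{(m)}} (f - f_n)\, G(d\mathcal{E})\Bigr|. \tag{142}$$

We use the inequality $|f - f_n| \le 2F$ on the set $\mathcal{E} - \mathcal{E}^{(m)}$, and
obtain, by (141),

$$\Bigl|\int_{\mathcal{E} - \mathcal{E}^{(m)}} (f - f_n)\, G(d\mathcal{E})\Bigr| \le \int_{\mathcal{E} - \mathcal{E}^{(m)}} |f - f_n|\, G(d\mathcal{E}) \le \int_{\mathcal{E} - \mathcal{E}^{(m)}} 2F\, G(d\mathcal{E}) \le 2\varepsilon.$$

Theorem 1 is already proved for the sets $\mathcal{E}^{(m)}$ of finite measure,
so that an $N$ exists such that, with $n > N$, the first term on the
right-hand side of (142) is $\le \varepsilon$. We thus have

$$\Bigl|\int_{\mathcal{E}} (f - f_n)\, G(d\mathcal{E})\Bigr| \le 3\varepsilon \quad\text{for } n > N,$$

which proves the theorem, inasmuch as $\varepsilon$ is arbitrary. The remaining
theorems of [54] are proved similarly.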


64. The class $L_2$ on a set of infinite measure. The formation of
the class $L_2$ and the theory of orthogonal functions may be carried
over easily to the case of a set $\mathcal{E}$ of infinite measure. We say that a
function $f(P)$ on a set $\mathcal{E}$ of infinite measure belongs to $L_2$ if it is
measurable on $\mathcal{E}$ and its square $f^2(P)$ or the square of its modulus
$|f(P)|^2$ is summable on $\mathcal{E}$. All the theorems of [55] remain in force,
except for Theorem 1. The finite measure of $\mathcal{E}$ was an essential factor
in Theorem 1. An example can easily be given of a function of $L_2$
which is not summable. For instance, $1/x$ belongs to $L_2$ in the interval
$[1, \infty)$, since $1/x^2$ is summable, but $1/x$ is itself not summable. We also
used the finite measure of $\mathcal{E}$ in the proof of Theorem 8. Let us show
that the theorem remains true when the measure of $\mathcal{E}$ is infinite.
For clarity, let $\mathcal{E}$ be the complete plane $\mathcal{E}_\infty$. Let the sequence $f_n(P)$
$(n = 1, 2, \ldots)$ of functions belonging to $L_2$ on $\mathcal{E}_\infty$ be mutually
convergent. Let $\Delta_m$ be the interval defined by $-m \le x \le m$,
$-m \le y \le m$. The functions $f_n(P)$ belong to $L_2$ and are mutually convergent
on each $\Delta_m$, since the integral of a non-negative function over $\Delta_m$
is not greater than the integral of the same function over the whole
plane. It follows from Theorem 8 [56] that a subsequence $f_1^{(1)}(P)$,
$f_2^{(1)}(P), \ldots$ can be extracted from the sequence $f_n(P)$ such that it is
convergent almost everywhere on $\Delta_1$. We can extract from this
subsequence a new subsequence $f_1^{(2)}, f_2^{(2)}, \ldots$, which is convergent almost
everywhere on $\Delta_2$, and so on. It is readily seen that the subsequence
$f_1^{(1)}(P), f_2^{(2)}(P), \ldots$ is convergent almost everywhere on $\mathcal{E}_\infty$ [IV; 15].
Let $f(P)$ be the limit function for this subsequence. Since the sequence
$f_n(P)$ is mutually convergent on $\mathcal{E}_\infty$, for any given $\varepsilon > 0$ there exists
an $N$ such that

$$\int_{\mathcal{E}_\infty} [f_k^{(k)}(P) - f_n(P)]^2\, G(d\mathcal{E}) < \varepsilon \quad\text{for } n_k^{(k)} \text{ and } n > N,$$

where $n_k^{(k)}$ denotes the index of $f_k^{(k)}(P)$ in the original sequence.
If $k$ tends to infinity, we obtain, as in [56]:

$$\int_{\mathcal{E}_\infty} [f(P) - f_n(P)]^2\, G(d\mathcal{E}) \le \varepsilon \quad\text{for } n > N,$$

which proves Theorem 8.
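If $f(P) \in L_2$ on $\mathcal{E}_\infty$ and we are given $\varepsilon_0 > 0$, there exists an $N$
such that

$$0 \le \int_{\mathcal{E}_\infty - \Delta_N} f^2(P)\, G(d\mathcal{E}) \le \varepsilon_0.$$

We define the function $\psi(P)$ as follows: $\psi(P) = f(P)$ in $\Delta_N$ and
$\psi(P) = 0$ outside $\Delta_N$. Obviously $\psi(P) \in L_2(\mathcal{E}_\infty)$ and

$$\|f - \psi\|^2 = \int_{\mathcal{E}_\infty} [f(P) - \psi(P)]^2\, G(d\mathcal{E}) = \int_{\mathcal{E}_\infty - \Delta_N} f^2(P)\, G(d\mathcal{E}) \le \varepsilon_0,$$

whence it is clear that the lineal of functions $\psi(P)$, differing from
zero only on some finite interval, is everywhere dense in $L_2(\mathcal{E}_\infty)$, i.e.
in $L_2$ on $\mathcal{E}_\infty$.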

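Let us show that $L_2(\mathcal{E}_\infty)$ is separable. As shown above, there
exists on $\Delta_m$ $(m = 1, 2, \ldots)$ a set of functions $\varphi_{k,m}(P)$ $(k, m = 1, 2, \ldots)$
everywhere dense in $L_2$. On putting $\varphi_{k,m}(P) = 0$ outside $\Delta_m$, we thus
obtain a denumerable set of functions $\varphi_{k,m}(P)$ of $L_2(\mathcal{E}_\infty)$. It is easily
seen that they are everywhere dense in $L_2(\mathcal{E}_\infty)$. For, let
$f(P) \in L_2(\mathcal{E}_\infty)$; given an $\varepsilon > 0$, there exists an $m_0$ such that

$$\int_{\mathcal{E}_\infty - \Delta_{m_0}} f^2(P)\, G(d\mathcal{E}) < \frac{\varepsilon^2}{2},$$

and, by what has been said, a function $\varphi_{k_0,m_0}(P)$ can be chosen from
the above-mentioned denumerable set such that

$$\int_{\Delta_{m_0}} [f(P) - \varphi_{k_0,m_0}(P)]^2\, G(d\mathcal{E}) < \frac{\varepsilon^2}{2},$$

and hence, since $\varphi_{k_0,m_0}(P) = 0$ outside $\Delta_{m_0}$,

$$\int_{\mathcal{E}_\infty} [f(P) - \varphi_{k_0,m_0}(P)]^2\, G(d\mathcal{E}) < \varepsilon^2,$$

which proves that $L_2(\mathcal{E}_\infty)$ is separable.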

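We now show that the lineal of continuous functions $\varphi(P)$, vanishing
outside some finite interval (different for different $\varphi(P)$), is everywhere
dense in $L_2(\mathcal{E}_\infty)$.

This lineal is usually called the lineal of finite continuous functions.

Let $f(P) \in L_2(\mathcal{E}_\infty)$; given $\varepsilon > 0$, we show that there exists a finite
continuous function $\varphi(P)$ such that $\|f - \varphi\|_{\mathcal{E}_\infty} \le \varepsilon$. As we have seen
above, there exists an $m$ such that $\|f\|_{\mathcal{E}_\infty - \Delta_m} \le \varepsilon/2$. Having fixed
this $m$, we can say that there exists a function $\varphi(P)$, continuous
in $\Delta_m$, such that $\|f - \varphi\|_{\Delta_m} \le \varepsilon/2$. On writing
$M = \max_{P \in \Delta_m} |\varphi(P)|$, and given any $h > 0$, we put $\varphi(P) = 0$ on the
boundary of $\Delta_{m+h}$ and continue $\varphi(P)$ onto the whole of $\Delta_{m+h}$ whilst
retaining the continuity and without exceeding $\max |\varphi(P)|$ [IV; 157].
Outside $\Delta_{m+h}$ we put $\varphi(P) = 0$, so that $\varphi(P)$ is a finite continuous
function. On taking into account what has been said, we obtain

$$\|f - \varphi\|^2_{\mathcal{E}_\infty} = \|f - \varphi\|^2_{\Delta_m} + \|f - \varphi\|^2_{\mathcal{E}_\infty - \Delta_m} \le \frac{\varepsilon^2}{4} + 2\|f\|^2_{\mathcal{E}_\infty - \Delta_m} + 2\|\varphi\|^2_{\mathcal{E}_\infty - \Delta_m} \le \frac{\varepsilon^2}{4} + \frac{\varepsilon^2}{2} + 2\|\varphi\|^2_{\mathcal{E}_\infty - \Delta_m}.$$

But $\|\varphi\|^2_{\mathcal{E}_\infty - \Delta_m}$ is equal to the integral of $|\varphi(P)|^2$ over
$\Delta_{m+h} - \Delta_m$. The area of this set is $4(m + h)^2 - 4m^2 = 4(h^2 + 2mh)$,
and we have for the Lebesgue integral:

$$\|\varphi\|^2_{\mathcal{E}_\infty - \Delta_m} = \int_{\Delta_{m+h} - \Delta_m} |\varphi(P)|^2\, dx\, dy \le 4M^2(h^2 + 2mh),$$

and we choose $h$ such that $4M^2(h^2 + 2mh) \le \varepsilon^2/8$, after which we get
$\|f - \varphi\|^2_{\mathcal{E}_\infty} \le \varepsilon^2$, which is what we had to prove. A similar statement
holds for the Lebesgue-Stieltjes integral.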

Theorem 2 of [60] is proved as above. Notice that, for the Lebesgue
integral, polynomials do not belong to $L_2(\mathscr{E}_\infty)$. If
$\mathscr{E}$ is any measurable set, on prolonging a function of
$L_2(\mathscr{E})$ on to $\mathscr{E}_\infty$ by zero, we get a function of
$L_2(\mathscr{E}_\infty)$. On starting out from this, we can extend everything
said above to the case of any unbounded measurable set.

We can quote as an example of a closed orthogonal system on
the interval $(-\infty, +\infty)$ the Hermite functions [III$_2$; 156]:

$$\psi_k(x) = (-1)^k e^{x^2/2}\, \frac{d^k}{dx^k}\left(e^{-x^2}\right)$$

and on the interval $(0, \infty)$ the Laguerre functions [III$_2$; 160]:

$$\psi_k(x) = e^{x/2}\, \frac{d^k}{dx^k}\left(x^k e^{-x}\right).$$

Both examples refer to the Lebesgue integral. 
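
The orthogonality of the Hermite functions can be checked numerically. The sketch below is only an illustration and is not part of the original text: it assumes the standard identity $\psi_k(x) = H_k(x) e^{-x^2/2}$, where $H_k$ is the Hermite polynomial generated by the recurrence $H_{k+1} = 2x H_k - 2k H_{k-1}$, and it replaces the integral over the whole line by a trapezoidal sum over $[-10, 10]$. The names `hermite_function` and `inner_product` are hypothetical.

```python
import math

def hermite_function(k, x):
    """psi_k(x) = H_k(x) * exp(-x^2 / 2), with H_k from the three-term recurrence."""
    if k == 0:
        return math.exp(-0.5 * x * x)
    h_prev, h_curr = 1.0, 2.0 * x          # H_0 and H_1
    for j in range(1, k):
        # H_{j+1} = 2 x H_j - 2 j H_{j-1}
        h_prev, h_curr = h_curr, 2.0 * x * h_curr - 2.0 * j * h_prev
    return h_curr * math.exp(-0.5 * x * x)

def inner_product(m, n, half_width=10.0, steps=4000):
    """Trapezoidal approximation of the integral of psi_m * psi_n over the line.

    Assumption: the tails beyond |x| = half_width are negligible.
    """
    step = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * step
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * hermite_function(m, x) * hermite_function(n, x)
    return total * step

off_diagonal = inner_product(1, 3)   # expected to be close to zero
diagonal = inner_product(2, 2)       # expected to be close to 8 * sqrt(pi)
```

The check says nothing about closure of the system, which is the substantive claim; it only illustrates orthogonality for a few indices.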

A simple proof of the fact that these systems are closed may be 
found in Volume 1 of Hilbert—Courant: Methoden der Mathematischen 
Physik (Interscience, N. Y., 1931, 1937). 

What has been said about $L_2(\mathscr{E}_\infty)$ can be carried over at once
to $L_p(\mathscr{E}_\infty)$ ($p > 1$), as was the case for a bounded set, and
to the case of complex functions.


65. An integrating function of bounded variation. We have so far
assumed, when investigating the Lebesgue-Stieltjes integral, that the
function $G(\mathscr{E})$ is non-negative. We turn next to the case when the
integrating function $G(\mathscr{E})$ is obtained from the function of an
interval $G(\Delta)$, which is a function of bounded variation. We have the
canonical form for such a function, as the difference between two
non-negative functions:

$$G(\Delta) = G_1(\Delta) - G_2(\Delta),$$

where

$$G_1(\Delta) = \tfrac{1}{2}\,[V(\Delta) + G(\Delta)]; \qquad G_2(\Delta) = \tfrac{1}{2}\,[V(\Delta) - G(\Delta)],$$


and $V(\Delta)$ is the total variation of $G(\Delta)$ on the interval $\Delta$.
Each of the functions $G_1(\Delta)$ and $G_2(\Delta)$ leads to a non-negative,
additive and normal function $G_1(\mathscr{E})$ and $G_2(\mathscr{E})$ on the
closed fields of sets $L_{G_1}$ and $L_{G_2}$. Let us write $L_G$ for the closed
field of sets forming the common part of $L_{G_1}$ and $L_{G_2}$. The completely
additive and normal function

$$G(\mathscr{E}) = G_1(\mathscr{E}) - G_2(\mathscr{E})$$

is defined on this field.

We take the non-negative, additive and normal interval function

$$V(\Delta) = G_1(\Delta) + G_2(\Delta).$$

Its extension leads to a function $V(\mathscr{E})$, defined on the closed field
$L_V$. On using the last formula and the fact that $G_i(\Delta)$ is non-negative,
it is easily shown that $L_V$ is the common part of $L_{G_1}$ and $L_{G_2}$, i.e.
$L_V$ coincides with $L_G$. We first have to show that, given any set
$\mathscr{E}$, its exterior measure with respect to the function $V(\Delta)$,
i.e. $|\mathscr{E}|_V$, is equal to the sum of its exterior measures with
respect to $G_1(\Delta)$ and $G_2(\Delta)$, i.e.
$|\mathscr{E}|_V = |\mathscr{E}|_{G_1} + |\mathscr{E}|_{G_2}$. It is then easily
shown, by using the definition of measurability, that, if $\mathscr{E}$ is
measurable with respect to $V(\Delta)$, it is measurable with respect to
$G_1(\Delta)$ and $G_2(\Delta)$; and also, conversely, if $\mathscr{E}$ is
measurable with respect to $G_1(\Delta)$ and $G_2(\Delta)$, it is measurable
with respect to $V(\Delta)$. When integrating, we have to consider the class
of functions $f(P)$ measurable with respect to $V(\Delta)$, i.e. the class of
functions measurable with respect to $G_1(\Delta)$ and $G_2(\Delta)$.
The integral is naturally defined by


$$\int_{\mathscr{E}} f(P)\, G(d\mathscr{E}) = \int_{\mathscr{E}} f(P)\, G_1(d\mathscr{E}) - \int_{\mathscr{E}} f(P)\, G_2(d\mathscr{E}),$$


and its existence is guaranteed by the existence of the integrals on
the right, on the assumption that both these integrals have finite
values. Otherwise, the right-hand side may reduce to an indeterminate
expression. Two functions are said to be equivalent if they are
equivalent with respect to $V(\mathscr{E})$. The properties of the integral
1, 4, 5, 7, 8, 9 of [52] are retained without change. In property 3, instead
of inequality (16) we have

$$\left| \int_{\mathscr{E}} f(P)\, G(d\mathscr{E}) \right| \le \int_{\mathscr{E}} |f(P)|\, V(d\mathscr{E}).$$

In property 6, instead of the convergence of series (49), we must
require the convergence of

$$\sum_{k=1}^{\infty} \int_{\mathscr{E}_k} |f(P)|\, V(d\mathscr{E}),$$

and finally, in property 10, instead of inequality (50) we have

$$\left| \int_{\mathscr{E}} f(P)\, G(d\mathscr{E}) \right| \le \int_{\mathscr{E}} F(P)\, V(d\mathscr{E}).$$

By the definition of the integral, summable functions are functions
summable with respect to $V(\mathscr{E})$. Theorems 1 and 2 of [54] about
passage to the limit are retained without change.

The concept of integral is also easily extended to the case when
$G(\Delta)$ is a complex function:

$$G(\Delta) = G'(\Delta) + G''(\Delta)\, i,$$

where $G'(\Delta)$ and $G''(\Delta)$ are functions of bounded variation. On
using the canonical forms of these functions:

$$G'(\Delta) = G'_1(\Delta) - G'_2(\Delta); \qquad G''(\Delta) = G''_1(\Delta) - G''_2(\Delta),$$

we arrive at the formula

$$G(\Delta) = \left(G'_1(\Delta) - G'_2(\Delta)\right) + \left(G''_1(\Delta) - G''_2(\Delta)\right) i.$$

The function $G(\Delta)$ leads to the function $G(\mathscr{E})$, defined on the
closed field $L_G$, which is the common part of the closed fields $L_{G'_i}$ and
$L_{G''_i}$ ($i = 1, 2$). The definition of measurable functions with respect
to $G(\mathscr{E})$ and of the integral are essentially the same as above; an
integrable function may be complex.

In the case of one variable, we have the canonical form for a function
of bounded variation: $g(x) = g_1(x) - g_2(x)$, where the last two functions
are non-decreasing; the integral is written as

$$\int_{\mathscr{E}} f(x)\, dg(x) = \int_{\mathscr{E}} f(x)\, dg_1(x) - \int_{\mathscr{E}} f(x)\, dg_2(x).$$

If we introduce the total variation $v(x) = g_1(x) + g_2(x)$, we have
the inequality

$$\left| \int_{\mathscr{E}} f(x)\, dg(x) \right| \le \int_{\mathscr{E}} |f(x)|\, dv(x),$$

and the summability of $f(x)$ with respect to $g_1(x)$ and $g_2(x)$ is
equivalent to the summability of $f(x)$ with respect to $v(x)$.
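
The inequality between the integral with respect to $g$ and the integral with respect to the total variation $v$ can be seen already at the level of Stieltjes sums. The following sketch is an illustration only, under assumptions that are not in the text: $g(x) = \sin x$ and $f(x) = x$ on $[0, 2\pi]$, a uniform subdivision, and a hypothetical helper `stieltjes_sums`. The increments of $v$ are approximated by $|g(x_k) - g(x_{k-1})|$, which is exact here because the subdivision contains the points where $\sin x$ changes direction.

```python
import math

def stieltjes_sums(f, g, a, b, n):
    """Return three sums over a uniform subdivision of [a, b] into n parts:

    sum f(xi_k) * dg_k, sum |f(xi_k)| * |dg_k|, and sum |dg_k|,
    where dg_k = g(x_k) - g(x_{k-1}) and xi_k is the midpoint of the k-th part.
    """
    s_g = s_v = variation = 0.0
    for k in range(1, n + 1):
        x_prev = a + (b - a) * (k - 1) / n
        x_next = a + (b - a) * k / n
        xi = 0.5 * (x_prev + x_next)
        dg = g(x_next) - g(x_prev)
        s_g += f(xi) * dg
        s_v += abs(f(xi)) * abs(dg)
        variation += abs(dg)
    return s_g, s_v, variation

# Illustrative choice (an assumption): g = sin, f(x) = x on [0, 2*pi].
s_g, s_v, variation = stieltjes_sums(lambda x: x, math.sin, 0.0, 2.0 * math.pi, 20000)
```

For this choice the sum against $dg$ is close to $\int_0^{2\pi} x \cos x\, dx = 0$, while the sum against $dv$ is close to $\int_0^{2\pi} x\,|\cos x|\, dx = 4\pi$, and the total variation of $\sin x$ on the interval is 4.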


66. The reduction of multiple integrals. We turn now to a discussion
of the basic result of the theory of Lebesgue multiple integrals,
concerning the reduction of a multiple integral to a sequence of
simple quadratures. Let us recall the corresponding results of the
earlier theory of multiple integrals [II; 97]. If, e.g. the function
$f(x, y)$ is continuous on the finite closed interval
$\Delta$ ($a \le x \le b$; $c \le y \le d$), the following formula holds for
reducing the double integral to two quadratures:

$$\iint_{\Delta} f(x, y)\, dx\, dy = \int_a^b \left[ \int_c^d f(x, y)\, dy \right] dx = \int_c^d \left[ \int_a^b f(x, y)\, dx \right] dy.$$
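
For a continuous integrand the equality of the two iterated integrals is easy to observe numerically. The sketch below is illustrative and rests on assumptions not made in the text: the integrand $f(x, y) = x y^2$ on $0 \le x \le 1$, $0 \le y \le 2$, and midpoint sums in place of the integrals. The helper names are hypothetical.

```python
def midpoint_nodes(lo, hi, n):
    """Midpoints of n equal subintervals of [lo, hi], together with the step."""
    step = (hi - lo) / n
    return [lo + (i + 0.5) * step for i in range(n)], step

def iterated_integral(f, a, b, c, d, n=200, x_outer=True):
    """Midpoint approximation of an iterated integral over [a, b] x [c, d].

    x_outer=True integrates over y first and then over x; False reverses the order.
    """
    xs, hx = midpoint_nodes(a, b, n)
    ys, hy = midpoint_nodes(c, d, n)
    if x_outer:
        return sum(sum(f(x, y) for y in ys) * hy for x in xs) * hx
    return sum(sum(f(x, y) for x in xs) * hx for y in ys) * hy

def f(x, y):
    # Illustrative integrand (an assumption); its double integral is 4/3.
    return x * y * y

dy_then_dx = iterated_integral(f, 0.0, 1.0, 0.0, 2.0, x_outer=True)
dx_then_dy = iterated_integral(f, 0.0, 1.0, 0.0, 2.0, x_outer=False)
```

Both values approximate $\tfrac{1}{2} \cdot \tfrac{8}{3} = \tfrac{4}{3}$ and agree with each other up to rounding, since the two sums contain the same terms in a different order.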


We next state the analogous theorem for the Lebesgue integral.
It was first proved by the Italian mathematician Fubini in 1907.

FUBINI'S THEOREM. Let $f(x, y)$ be a summable function on the finite
interval $\Delta$ ($a \le x \le b$; $c \le y \le d$). Then $f(x, y)$ is summable
with respect to $y$ on the interval $[c, d]$ for almost all values of $x$ of
$[a, b]$, the function

$$h(x) = \int_c^d f(x, y)\, dy, \qquad (143)$$

defined almost everywhere in $[a, b]$, is summable over this interval,
and we have the equation

$$\iint_{\Delta} f(x, y)\, dx\, dy = \int_a^b \left[ \int_c^d f(x, y)\, dy \right] dx. \qquad (144)$$

Similar statements hold when the order of integration is changed.
In this case we have

$$\iint_{\Delta} f(x, y)\, dx\, dy = \int_c^d \left[ \int_a^b f(x, y)\, dx \right] dy. \qquad (145)$$

Notice that the integrals in the theorem are understood in the
Lebesgue sense, and the summability of functions must naturally be
also understood in this sense. The assertion of summability obviously
includes the assertion that the function is measurable. It should be
noticed that function (143) is defined almost everywhere, but not
necessarily at every point, on $[a, b]$. A similar remark applies for the
function

$$l(y) = \int_a^b f(x, y)\, dx. \qquad (146)$$


In order to clarify the proof, which is fairly difficult, we have 
stated Fubini’s theorem for a particular case. We shall indicate later 
the various more general statements of the theorem. The proof must 
be preceded by several lemmas. 

LEMMA 1. If Fubini's theorem holds for functions $f_1(x, y), f_2(x, y), \ldots,
f_m(x, y)$, summable on the interval $\Delta$, it holds for any linear
combination of these functions:

$$f(x, y) = \sum_{k=1}^{m} c_k f_k(x, y). \qquad (147)$$

Each of the $f_k(x, y)$ is summable by hypothesis with respect to $y$
on $[c, d]$, if we exclude from the interval $[a, b]$ of variation of $x$ some
set $A_k$ of measure zero. If we exclude from $[a, b]$ the set
$A = A_1 + A_2 + \ldots + A_m$, which is also of measure zero, for the remaining
values of $x$ function (147) will be summable with respect to $y$ on
$[c, d]$. All the functions

$$h_k(x) = \int_c^d f_k(x, y)\, dy$$

will be defined on $[a, b]$ except for points of set $A$. Further, (144)
holds for the $f_k(x, y)$ by hypothesis. If we use the rule for integration
of a sum and take the constant factor outside the integral, (144) will
be seen to hold for function (147), and the lemma is proved.

Note. If all we are given about the $f_k(x, y)$ is that they are
measurable with respect to $y$ on $[c, d]$ for almost all values of $x$ of
$[a, b]$, the same can evidently be said of function (147), since the
sum of measurable functions is also measurable. It is naturally
assumed here that the sum has a meaning [43].

LEMMA 2. Let $f_n(x, y)$ be a monotonic sequence of summable functions
on the interval $\Delta$, convergent to $f(x, y)$, summable on $\Delta$. If
Fubini's theorem holds for each of the $f_n(x, y)$, it holds for the limit
function $f(x, y)$.

We shall assume in the proof that $f_n(x, y)$ is a non-decreasing
sequence. The case of a non-increasing sequence reduces to this case
by replacing $f_n(x, y)$ by $-f_n(x, y)$. By hypothesis, each $f_n(x, y)$ is
measurable and summable with respect to $y$ on $[c, d]$, if we exclude
from the interval $[a, b]$ of variation of $x$ a set $A_n$ of measure zero.
If we exclude from $[a, b]$ the set $A = A_1 + A_2 + \ldots$, which also
has measure zero, for the remaining set of $x$ the limit function $f(x, y)$
will also be measurable with respect to $y$ on $[c, d]$. By hypothesis,
each of the functions

$$h_n(x) = \int_c^d f_n(x, y)\, dy \qquad (148)$$

is defined on $[a, b]$, if we exclude the set $A_n$ of $x$ of measure zero.
If we exclude from $[a, b]$ the set $A$, which also has measure zero,
all the functions (148) will be defined for the remaining $x$, i.e. will
be defined on the set $[a, b] - A$, and, by hypothesis, are summable
on $[a, b]$. The sequence $h_n(x)$ is non-decreasing, and we can define the
limit function $h(x) = \lim_{n \to \infty} h_n(x)$, defined almost everywhere
on $[a, b]$ and measurable. On recalling that Fubini's theorem holds by
hypothesis for the $f_n(x, y)$, and that the limit function $f(x, y)$ is
summable on $\Delta$ by hypothesis, we can write

$$\int_a^b h_n(x)\, dx = \iint_{\Delta} f_n(x, y)\, dx\, dy \le \iint_{\Delta} f(x, y)\, dx\, dy.$$

Hence, by Theorem 2 of [54], we can say that $h(x)$ is summable
on $[a, b]$, and we have the formula

$$\int_a^b h(x)\, dx = \lim_{n \to \infty} \int_a^b h_n(x)\, dx = \lim_{n \to \infty} \iint_{\Delta} f_n(x, y)\, dx\, dy.$$

On the other hand, by Theorem 2 of [54], we have

$$\lim_{n \to \infty} \iint_{\Delta} f_n(x, y)\, dx\, dy = \iint_{\Delta} f(x, y)\, dx\, dy,$$

and we can therefore write

$$\int_a^b h(x)\, dx = \iint_{\Delta} f(x, y)\, dx\, dy. \qquad (149)$$

We have defined $h(x)$ as follows:

$$h(x) = \lim_{n \to \infty} h_n(x) = \lim_{n \to \infty} \int_c^d f_n(x, y)\, dy.$$
To complete the proof of the lemma, we still have to show that
$f(x, y)$ is summable with respect to $y$ on $[c, d]$ for almost all $x$ of
$[a, b]$, and that $h(x)$ can be expressed almost everywhere on $[a, b]$ by

$$h(x) = \int_c^d f(x, y)\, dy. \qquad (150)$$

After proving this, we obtain Fubini's theorem in full for $f(x, y)$,
by (149). Let $B$ be the set of points of $[a, b]$ at which $h(x)$ is defined
and equal to $(+\infty)$. Since $h(x)$ is summable, $B$ is of measure zero.
If we exclude from $[a, b]$ the set $A + B$ of measure zero, on the
remaining set, i.e. almost everywhere on $[a, b]$, the non-decreasing sequence
$h_n(x)$ tends to $h(x)$, which takes finite values, i.e. for every $x$ of the
set $[a, b] - (A + B)$, the integrals over $[c, d]$ of the non-decreasing
sequence of functions $f_n(x, y)$ of $y$ are bounded by the number $h(x)$.
By Theorem 2 of [54], for these values of $x$, $f(x, y)$ is summable with
respect to $y$ on $[c, d]$ and we have

$$\int_c^d f(x, y)\, dy = \lim_{n \to \infty} \int_c^d f_n(x, y)\, dy,$$

and, by (148), $h(x)$ is given by (150) almost everywhere on $[a, b]$.
The lemma is therefore proved.

Note. It follows at once from the beginning of the above
proof that, if we are only given that the $f_n(x, y)$ are measurable
with respect to $y$ on $[c, d]$ for almost all $x$ of $[a, b]$, the limit
function $f(x, y)$ is measurable with respect to $y$ for almost all $x$ of
$[a, b]$.


67. The case of the characteristic function. The aim of the present
section is to prove Fubini's theorem for the case when the integrand
is the characteristic function of some measurable set $\mathscr{E}$ belonging
to the interval $\Delta$ referred to in the theorem. The integral of
$\omega_{\mathscr{E}}(P) = \omega_{\mathscr{E}}(x, y)$ obviously gives the measure
$m(\mathscr{E})$ of the set $\mathscr{E}$ as a set on the plane. Let
$\mathscr{E}_{x_0}$ be the set of the points of $\mathscr{E}$ which have a given
abscissa $x_0$, i.e. $\mathscr{E}_{x_0}$ is the intersection of $\mathscr{E}$ with
the straight line $x = x_0$. The characteristic function of this set is equal
to $\omega_{\mathscr{E}}(x_0, y)$. The measurability of $\mathscr{E}_{x_0}$ with
respect to $y$ is equivalent to the measurability of
$\omega_{\mathscr{E}}(x_0, y)$ on the interval $[c, d]$, and if this is the case,
the linear measure of $\mathscr{E}_{x_0}$, which we denote by
$m'(\mathscr{E}_{x_0})$, is equal to the integral of $\omega_{\mathscr{E}}(x_0, y)$
over the interval in question. The summability of $\omega_{\mathscr{E}}(x, y)$
is guaranteed by virtue of its being bounded. Fubini's theorem for the
characteristic function $\omega_{\mathscr{E}}(x, y)$ thus reduces to the
following: $\omega_{\mathscr{E}}(x, y)$ is measurable with respect to $y$ on the
interval $[c, d]$ for almost all $x$ of $[a, b]$, the bounded function

$$h(x) = \int_c^d \omega_{\mathscr{E}}(x, y)\, dy$$

is measurable in $[a, b]$, and

$$m(\mathscr{E}) = \int_a^b \left[ \int_c^d \omega_{\mathscr{E}}(x, y)\, dy \right] dx. \qquad (151)$$

More briefly, $\mathscr{E}_x$ is measurable with respect to $y$ for almost all
$x$, and

$$m(\mathscr{E}) = \int_a^b m'(\mathscr{E}_x)\, dx. \qquad (152)$$
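
Formula (152) says that the plane measure of a set is the integral of the linear measures of its sections. A small numerical sketch, which is not part of the original text, takes the unit disk as the set (an assumption made for illustration): the section at abscissa $x$ is an interval of length $2\sqrt{1 - x^2}$, and a midpoint sum of these lengths should approach the area $\pi$. The function names are hypothetical.

```python
import math

def section_length(x):
    """Linear measure of the section of the unit disk at abscissa x."""
    return 2.0 * math.sqrt(1.0 - x * x) if abs(x) < 1.0 else 0.0

def measure_from_sections(a, b, n):
    """Midpoint approximation of the integral of the section lengths over [a, b]."""
    step = (b - a) / n
    return sum(section_length(a + (i + 0.5) * step) for i in range(n)) * step

area = measure_from_sections(-1.0, 1.0, 20000)   # approximates pi
```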


We shall prove Fubini’s theorem for the characteristic function in 
stages. 

LEMMA 3. Fubini's theorem holds for the characteristic function of
any semi-open interval, open set and a set $G_\delta$ belonging to $\Delta$.

If $\Delta'$ [$\alpha < x \le \beta$; $\gamma < y \le \delta$] is a semi-open
interval belonging to $\Delta$, $\omega_{\Delta'}(x, y)$ is measurable with
respect to $y$ for any $x$:

$$h(x) = \int_c^d \omega_{\Delta'}(x, y)\, dy = \delta - \gamma, \quad \text{if } \alpha < x \le \beta, \quad \text{and } h(x) = 0 \text{ if } x \text{ is outside } (\alpha, \beta],$$

and the lemma is obvious, since the measure of $\Delta'$ is equal to
$(\beta - \alpha)(\delta - \gamma)$. The open set $O$ is the sum of a denumerable
number of semi-open non-overlapping intervals $\Delta_k$, and

$$\omega_{O}(x, y) = \sum_{k=1}^{\infty} \omega_{\Delta_k}(x, y).$$

By lemma 1, Fubini's theorem holds for the finite sum

$$\sum_{k=1}^{m} \omega_{\Delta_k}(x, y).$$

As $m$ increases, these sums form a non-decreasing sequence, which
tends to a bounded, and consequently summable, function $\omega_{O}(x, y)$,
and, by lemma 2, Fubini's theorem also holds for $\omega_{O}(x, y)$. Suppose,
finally, that $\mathscr{E}$ is a set $G_\delta$ belonging to the open interval
$\Delta$. We can write it in the form

$$\mathscr{E} = \prod_{k=1}^{\infty} O_k, \qquad (153)$$


where $O_k$ are open sets belonging to the open interval $\Delta$. Notice
that, if certain $O_k$ did not belong to the open interval $\Delta$, we should
be able to replace $O_k$ by the product of $O_k$ with the open interval $\Delta$.
By (153), $\omega_{\mathscr{E}}(x, y)$ is the limit of a non-increasing sequence
of characteristic functions $\omega_{\mathscr{E}_m}(x, y)$ of the open sets

$$\mathscr{E}_m = \prod_{k=1}^{m} O_k,$$

and since Fubini's theorem is already proved for $\omega_{\mathscr{E}_m}(x, y)$,
it holds for $\omega_{\mathscr{E}}(x, y)$, by lemma 2. If certain of the points
of the set $\mathscr{E}$ of type $G_\delta$ lie on the contour of $\Delta$, we
somewhat widen $\Delta$ so that $\mathscr{E}$ lies inside the widened interval
$\Delta_1$. Fubini's theorem holds for $\omega_{\mathscr{E}}(x, y)$ on $\Delta_1$.
Hence, in view of the fact that $\omega_{\mathscr{E}}(x, y) = 0$ outside
$\Delta$, we at once obtain Fubini's theorem for $\omega_{\mathscr{E}}(x, y)$ in
the interval $\Delta$. Notice that, in all the cases discussed in this lemma,
$\omega_{\mathscr{E}}(x, y)$ is measurable with respect to $y$ for all $x$.

LEMMA 4. If $\mathscr{E}$ is a set belonging to $\Delta$ and having plane
measure zero, the linear measure of $\mathscr{E}_x$ is zero for almost all $x$
of $[a, b]$, and Fubini's theorem holds for $\omega_{\mathscr{E}}(x, y)$.

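We form the set $\mathscr{E}_0$ of type $G_\delta$ belonging to $\Delta$,
covering $\mathscr{E}$ and such that $m(\mathscr{E}_0 - \mathscr{E}) = 0$ [40].
We have $\mathscr{E}_0 = \mathscr{E} + (\mathscr{E}_0 - \mathscr{E})$, and since
$m(\mathscr{E}) = 0$ and $m(\mathscr{E}_0 - \mathscr{E}) = 0$, we have
$m(\mathscr{E}_0) = 0$. Fubini's theorem holds for $\omega_{\mathscr{E}_0}$ and
we can write

$$\int_a^b \left[ \int_c^d \omega_{\mathscr{E}_0}(x, y)\, dy \right] dx = 0.$$

The quantity in square brackets is non-negative, and, by property
14 of [49], we have almost everywhere on the interval $a \le x \le b$:

$$\int_c^d \omega_{\mathscr{E}_0}(x, y)\, dy = 0.$$

It is clear from this that the linear measure of the set of points
of $\mathscr{E}_0$ lying on almost all straight lines parallel to the $Y$ axis
is zero. Since $\mathscr{E} \subset \mathscr{E}_0$, this holds all the more for
the set $\mathscr{E}$, i.e. at almost all $x$ of $[a, b]$,

$$\int_c^d \omega_{\mathscr{E}}(x, y)\, dy = 0,$$

so that Fubini's theorem holds for $\omega_{\mathscr{E}}(x, y)$:

$$m(\mathscr{E}) = 0 = \int_a^b \left[ \int_c^d \omega_{\mathscr{E}}(x, y)\, dy \right] dx.$$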

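LEMMA 5. Fubini's theorem holds for $\omega_{\mathscr{E}}(x, y)$ of any
measurable set $\mathscr{E}$ belonging to $\Delta$.

We form the set $\mathscr{E}_0$ of type $G_\delta$ belonging to $\Delta$,
covering $\mathscr{E}$ and such that $m(\mathscr{E}_0 - \mathscr{E}) = 0$. By
lemmas 3 and 4, Fubini's theorem holds for the characteristic functions of
the sets $\mathscr{E}_0$ and $(\mathscr{E}_0 - \mathscr{E})$. But
$\omega_{\mathscr{E}}(x, y) = \omega_{\mathscr{E}_0}(x, y) - \omega_{\mathscr{E}_0 - \mathscr{E}}(x, y)$,
and by lemma 1, Fubini's theorem holds for $\omega_{\mathscr{E}}(x, y)$.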

Notice that, if the measurable unbounded set $\mathscr{E}$ has finite measure,
the $\mathscr{E}_x$ are measurable for almost all $x$, and (152) holds. This
follows at once by a passage to the limit from the bounded sets. It is easily
shown in the same way that, if $\mathscr{E}$ is simply measurable,
$\mathscr{E}_x$ is measurable for almost all $x$. If, in addition,
$m'(\mathscr{E}_x)$ is summable, $\mathscr{E}$ has finite measure, and (152)
holds. Lemma 4 obviously also holds for unbounded sets.

68. Fubini's theorem. We still need a further simple lemma for
the complete proof of Fubini's theorem.

LEMMA 6. Fubini's theorem holds for a measurable function $f(x, y)$
that takes a finite number of finite values in $\Delta$.

Let $f(x, y) = c_k$ ($k = 1, 2, \ldots, m$) if the point $(x, y)$ belongs to
the set $\mathscr{E}_k$, where
$\Delta = \mathscr{E}_1 + \mathscr{E}_2 + \ldots + \mathscr{E}_m$. We can write
$f(x, y)$ as a linear combination of the characteristic functions of the sets
$\mathscr{E}_k$:

$$f(x, y) = \sum_{k=1}^{m} c_k\, \omega_{\mathscr{E}_k}(x, y),$$

and it follows from lemmas 1 and 5 that Fubini's theorem holds
for $f(x, y)$.

Fubini's theorem can be proved very simply on the basis of the last
lemma. Let $f(x, y)$ be summable on $\Delta$. We split it into positive and
negative parts: $f(x, y) = f^+(x, y) - f^-(x, y)$. By lemma 1, it is sufficient
to prove the theorem for $f^+$ and $f^-$, i.e. we can assume in the proof
that the summable function $f(x, y)$ is non-negative. As we know
from [46], such a function can be written as the limiting function of
a non-decreasing sequence of measurable non-negative functions
$f_n(x, y)$ with a finite number of values. By lemma 6, Fubini's theorem
holds for the $f_n(x, y)$, so that, by lemma 2, it also holds for $f(x, y)$,
and the theorem is proved.

Notice that we assume the $f(x, y)$ in the theorem to be summable
on the interval $\Delta$. Given this condition, the quadratures on the
right-hand sides of (144) and (145) have a meaning by virtue of the theorem,
and give the double integral of $f(x, y)$ over $\Delta$. The converse
conclusion, that the double integral exists if the quadratures on the
right-hand sides have a meaning, may be false. Examples may be quoted in
which the iterated integrals on the right-hand sides of (144) and
(145) have a meaning, and the results are equal to each other, yet
$f(x, y)$ is not measurable on $\Delta$, or is measurable but not summable.
But if $f(x, y)$ is non-negative on $\Delta$, the converse holds, and we have
the following theorem.
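
A related standard example, which is not taken from the text, shows what can go wrong without summability. For $f(x, y) = (x^2 - y^2)/(x^2 + y^2)^2$ on the square $0 < x \le 1$, $0 < y \le 1$, both iterated integrals exist but they differ, so by Fubini's theorem $f$ cannot be summable on the square. In the sketch below the inner integrals are written in closed form, using the antiderivatives $y/(x^2 + y^2)$ with respect to $y$ and $-x/(x^2 + y^2)$ with respect to $x$; only the outer integrals are approximated, by midpoint sums. The function names are hypothetical.

```python
import math

def inner_dy(x):
    """Integral of f(x, y) over 0 < y <= 1 for fixed x > 0.

    Uses the antiderivative y / (x^2 + y^2), which gives 1 / (1 + x^2).
    """
    return 1.0 / (1.0 + x * x)

def inner_dx(y):
    """Integral of f(x, y) over 0 < x <= 1 for fixed y > 0.

    Uses the antiderivative -x / (x^2 + y^2), which gives -1 / (1 + y^2).
    """
    return -1.0 / (1.0 + y * y)

def midpoint(func, a, b, n=2000):
    """Midpoint approximation of the integral of func over [a, b]."""
    step = (b - a) / n
    return sum(func(a + (i + 0.5) * step) for i in range(n)) * step

dy_then_dx = midpoint(inner_dy, 0.0, 1.0)   # approximates  pi / 4
dx_then_dy = midpoint(inner_dx, 0.0, 1.0)   # approximates -pi / 4
```

The two orders of integration give $\pi/4$ and $-\pi/4$ respectively. This differs from the examples mentioned above, in which the iterated integrals agree, but it illustrates the same point: the existence of the iterated integrals does not by itself give the double integral.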

THEOREM. If $f(x, y)$ is measurable and non-negative on the interval
$\Delta$, the existence of the iterated integral on the right-hand side of
(144) implies that $f(x, y)$ is summable on $\Delta$, so that Fubini's theorem
holds for it.

Suppose that the iterated integral on the right of (144) has a
meaning, i.e. function (143), summable on $[a, b]$, exists almost everywhere
as regards $x$ on $[a, b]$. We introduce the functions

$$[f(x, y)]_n = \begin{cases} f(x, y), & \text{if } f(x, y) \le n, \\ n, & \text{if } f(x, y) > n. \end{cases}$$

They are bounded and measurable, and form a non-decreasing
sequence which tends to $f(x, y)$. They are obviously summable on $\Delta$,
and Fubini's theorem holds for them. We can write

$$\iint_{\Delta} [f(x, y)]_n\, dx\, dy = \int_a^b \left[ \int_c^d [f(x, y)]_n\, dy \right] dx,$$

but $[f(x, y)]_n \le f(x, y)$, so that

$$\iint_{\Delta} [f(x, y)]_n\, dx\, dy \le \int_a^b h(x)\, dx,$$

whence it follows [50] that $f(x, y)$ is summable on $\Delta$.

COROLLARY 1. If $f(x, y)$ changes sign, but the right-hand side of
(144) exists for $|f(x, y)|$, then $|f(x, y)|$ is summable by virtue of
the theorem, so that $f(x, y)$ is also summable on $\Delta$, and Fubini's
theorem is applicable to it.

COROLLARY 2. If $f(x, y)$ is measurable on $\Delta$ and summable with
respect to $y$ for almost all $x$, the $h(x)$ defined by (143) is a measurable
function. As usual, we can assume $f(x, y)$ non-negative. We have
Fubini's theorem for $[f(x, y)]_n$ (which is bounded), and

$$h_n(x) = \int_c^d [f(x, y)]_n\, dy$$

is measurable. On letting $n$ tend to infinity, we see that the limit
function $h(x)$ is measurable.

We must note some simple generalizations of the statement of
Fubini's theorem. If $f(x, y)$ is summable on the measurable bounded
set $\mathscr{E}$, we have

$$\iint_{\mathscr{E}} f(x, y)\, dx\, dy = \int_{B_x} \left[ \int_{\mathscr{E}_x} f(x, y)\, dy \right] dx = \int_{B_y} \left[ \int_{\mathscr{E}_y} f(x, y)\, dx \right] dy, \qquad (154)$$

where $\mathscr{E}_x$ is the set of points of $\mathscr{E}$ having a given
abscissa $x$, and $\mathscr{E}_y$ is the analogous set for $y$, whilst $B_x$ and
$B_y$ are the projections of $\mathscr{E}$ on the $X$ and $Y$ axes. The integrals
over $\mathscr{E}_x$ and $\mathscr{E}_y$ may not have a meaning for values of $x$
and $y$ forming a set of measure zero. To prove (154), it is sufficient to
cover $\mathscr{E}$ by a finite interval $\Delta$ and construct a function
$f_0(x, y)$, equal to $f(x, y)$ at points of $\mathscr{E}$ and to zero at all
points of $\Delta$ not belonging to $\mathscr{E}$. We now show that Fubini's
theorem can be extended to the case of unbounded sets. Let us take the entire
plane as an example. It is sufficient to consider non-negative functions.
Thus, let $f(x, y)$ be measurable, non-negative and summable over the
plane, i.e. the double integral exists:

$$A = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} f(x, y)\, dx\, dy. \qquad (155)$$


The function $f(x, y)$ will be summable on any finite interval
$\Delta_{mn}$ [$-m \le x \le m$; $-n \le y \le n$]. Fubini's theorem holds on
this interval, i.e.

$$\iint_{\Delta_{mn}} f(x, y)\, dx\, dy = \int_{-m}^{+m} \left[ \int_{-n}^{+n} f(x, y)\, dy \right] dx.$$

On the other hand, since $f(x, y)$ is non-negative we have

$$\iint_{\Delta_{mn}} f(x, y)\, dx\, dy \le A,$$

so that

$$\int_{-m}^{+m} \left[ \int_{-n}^{+n} f(x, y)\, dy \right] dx \le A.$$

If $n$ increases indefinitely and we use Theorem 4 of [54], we get

$$\int_{-m}^{+m} \left[ \int_{-\infty}^{+\infty} f(x, y)\, dy \right] dx \le A.$$

If we now let $m$ increase and use the definition of the integral
over an infinite straight line, we arrive at the inequality

$$\int_{-\infty}^{+\infty} \left[ \int_{-\infty}^{+\infty} f(x, y)\, dy \right] dx \le A. \qquad (156)$$
Finally, we show that the $<$ sign cannot hold. If it does, there
must exist a positive $a$ such that

$$\int_{-\infty}^{+\infty} \left[ \int_{-\infty}^{+\infty} f(x, y)\, dy \right] dx \le A - a,$$

and all the more

$$\iint_{\Delta_{nn}} f(x, y)\, dx\, dy = \int_{-n}^{+n} \left[ \int_{-n}^{+n} f(x, y)\, dy \right] dx \le A - a,$$

which is absurd, since the integral over $\Delta_{nn}$ must tend to integral
(155) on indefinite increase of $n$. We must therefore have the $=$ sign
in (156), and, on comparing with (155), we in fact obtain for the whole
plane the formula that appears in Fubini's theorem. It obviously
follows from our proof that the iterated integral appearing in this
formula exists.

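Fubini's theorem can also be stated for integrals of any multiplicity.
The result is as follows. Let $\Delta_{m+n}$ be an interval in space $R_{m+n}$,
having $(m + n)$ dimensions, defined by the inequalities

$$a_1 \le x_1 \le b_1; \quad a_2 \le x_2 \le b_2; \quad \ldots; \quad a_{m+n} \le x_{m+n} \le b_{m+n},$$

whilst $\Delta_m$ and $\Delta_n$ are intervals in spaces $R_m$ and $R_n$ defined by

$$\Delta_m: \quad a_1 \le x_1 \le b_1; \quad a_2 \le x_2 \le b_2; \quad \ldots; \quad a_m \le x_m \le b_m;$$

$$\Delta_n: \quad a_{m+1} \le x_{m+1} \le b_{m+1}; \quad \ldots; \quad a_{m+n} \le x_{m+n} \le b_{m+n}.$$

Further, let $f(P)$ be a function summable over $\Delta_{m+n}$. If we fix a
point $P_0(x_1^0, x_2^0, \ldots, x_m^0)$ of $\Delta_m$, $f(P)$ will be a summable
function in $\Delta_n$ for any choice of $P_0$, except possibly for a set of
points $P_0$ which has measure zero in $R_m$. The integral of $f(P)$ over
$\Delta_n$:

$$h(x_1, x_2, \ldots, x_m) = \int_{\Delta_n} f(P)\, dx_{m+1} \ldots dx_{m+n}$$

gives a summable function in $\Delta_m$, and the formula holds:

$$\int_{\Delta_{m+n}} f(P)\, dx_1\, dx_2 \ldots dx_{m+n} = \int_{\Delta_m} \left[ \int_{\Delta_n} f(P)\, dx_{m+1}\, dx_{m+2} \ldots dx_{m+n} \right] dx_1\, dx_2 \ldots dx_m. \qquad (157)$$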

Fubini's theorem also admits of simple generalization to the case
of Lebesgue-Stieltjes integrals. Suppose we have two increasing and
bounded functions $g(x)$ and $k(y)$. By using these functions, we can
define measures $G(\Delta)$ and $K(\Delta)$ of semi-open intervals and then
extend the functions in question to the closed fields $L_G$ and $L_K$. We thus
obtain additive, non-negative and normal functions $G(\mathscr{E})$ and
$K(\mathscr{E})$ on $L_G$ and $L_K$. Similarly, by starting from the function
$g(x)\, k(y)$, defined on the plane, we can form an additive, non-negative and
normal function $M(\mathscr{E})$ on some closed field $L_M$ of sets on the plane.
If $f(P) = f(x, y)$ is measurable with respect to $M(\mathscr{E})$ and summable
over some interval $\Delta$ of the plane, this function is summable with
respect to $y$ over the interval $\Delta_y$ of the $Y$ axis, corresponding to the
interval $\Delta$ of the plane, with respect to the function $K(\mathscr{E})$:

$$h(x) = \int_{\Delta_y} f(x, y)\, K(d\mathscr{E}) = \int_{\Delta_y} f(x, y)\, dk(y),$$

if we exclude from the interval $\Delta_x$ of the $X$ axis, corresponding to
the interval $\Delta$ of the plane, some set of values having measure zero
with respect to $G(\mathscr{E})$. The function $h(x)$ is summable on $\Delta_x$
with respect to $G(\mathscr{E})$, and the formula holds:

$$\int_{\Delta} f(P)\, M(d\mathscr{E}) = \iint_{\Delta} f(x, y)\, d\,[g(x)\, k(y)] = \int_{\Delta_x} \left[ \int_{\Delta_y} f(x, y)\, dk(y) \right] dg(x).$$

Another similar formula is obtained by changing the order of
integration. The proof of this generalization of Fubini's theorem
is precisely the same as that of the basic theorem, except that the
Lebesgue integral must be replaced everywhere by Lebesgue-Stieltjes
integrals, and measurability in the Lebesgue sense by measurability
with respect to the functions $G(\mathscr{E})$, $K(\mathscr{E})$ and
$M(\mathscr{E})$.

Note. If we are only given that the function $f(x, y)$ is measurable
on an interval $\Delta$ of the plane, it follows from this that it is measurable
with respect to $y$ on $[c, d]$ for almost all $x$ of $[a, b]$, and is measurable
with respect to $x$ on $[a, b]$ for almost all $y$ of $[c, d]$. This remark
follows at once from the remarks that we made after the proofs of
Lemmas 1 and 2, and from the later proof of Fubini's theorem.


69. Change of the order of integration. Another theorem may be mentioned, 
on changing the order of integration. 

THEOREM. Let the function $g(x, t)$ be summable with respect to $t$ over the
interval $[c, d]$ for all $x$ of the interval $[a, b]$, and be of bounded
variation with respect to $x$ on this interval for all $t$ of $[c, d]$, except
possibly for a set of $t$ of Lebesgue measure zero. Further, let the total
variation of $g(x, t)$ with respect to $x$ on $[a, b]$ for all the $t$ in
question not exceed some non-negative function $F(t)$, measurable on $[c, d]$,
for which the integral exists:

$$\int_c^d F(t)\, dt. \qquad (158)$$

Now, the function

$$\int_c^d g(x, t)\, dt \qquad (159)$$


is of bounded variation in $x$ on $[a, b]$, and we have, for any function
$f(x)$, continuous in $[a, b]$:

$$\int_c^d \left[ \int_a^b f(x)\, d_x g(x, t) \right] dt = \int_a^b f(x)\, d_x \left[ \int_c^d g(x, t)\, dt \right], \qquad (160)$$

the integrals with respect to $t$ being Lebesgue integrals.

To prove that (159) is a function of bounded variation, we divide $[a, b]$
by $a = x_0 < x_1 < x_2 < \ldots < x_{n-1} < x_n = b$ and form the sum $t_n$ [8]
for this subdivision. We get

$$t_n = \sum_{k=1}^{n} \left| \int_c^d g(x_k, t)\, dt - \int_c^d g(x_{k-1}, t)\, dt \right| = \sum_{k=1}^{n} \left| \int_c^d [g(x_k, t) - g(x_{k-1}, t)]\, dt \right|,$$

whence

$$t_n \le \int_c^d \sum_{k=1}^{n} |g(x_k, t) - g(x_{k-1}, t)|\, dt.$$

But, by hypothesis,

$$\sum_{k=1}^{n} |g(x_k, t) - g(x_{k-1}, t)| \le F(t),$$

so that

$$t_n \le \int_c^d F(t)\, dt,$$

whence it follows that (159) is a function of bounded variation.
We write the obvious equation:

$$\int_c^d \sum_{k=1}^{n} f(\xi_k)\, [g(x_k, t) - g(x_{k-1}, t)]\, dt = \sum_{k=1}^{n} f(\xi_k) \left[ \int_c^d g(x_k, t)\, dt - \int_c^d g(x_{k-1}, t)\, dt \right], \qquad (161)$$

where $\xi_k$ is a point of $[x_{k-1}, x_k]$. On indefinite subdivision, the
right-hand side of (161) tends to the integral on the right of (160). Since
$f(x)$ is continuous in $[a, b]$, we have $|f(x)| \le L$, where $L$ is a positive
number. We have for the integrand of the integral on the left-hand side of
(161):

$$\left| \sum_{k=1}^{n} f(\xi_k)\, [g(x_k, t) - g(x_{k-1}, t)] \right| \le L\, F(t).$$

On applying Theorem 1 of [54], we see that we can pass to the limit on
indefinite subdivision under the integral sign in the integral on the left-hand
side of (161), where the integrand in question gives in the limit the Stieltjes
integral:

$$\int_a^b f(x)\, d_x g(x, t).$$

Finally, passage to the limit in (161) leads to (160). This theorem admits
of some elementary generalizations. For instance, we can assume $[a, b]$ infinite
and $f(x)$ continuous inside this interval and bounded. The Lebesgue integral
with respect to $t$ can be replaced by the Lebesgue-Stieltjes integral. The
original Stieltjes integral with respect to $g(x, t)$ can be replaced by a
general Stieltjes integral and $f(x)$ can be assumed merely bounded on $[a, b]$.
In this case, the existence of the integral on the right of (160) implies the
existence of the integral on the left-hand side, and the equality of these two
integrals.


70. Continuity in the mean. We return to $L_p$ ($p > 1$) and show that
every function of $L_p$ is continuous, if we take account of the increment
in its norm in $L_p$. We take the case of a bounded measurable set
$\mathscr{E}$. We take the integrals in the Lebesgue sense and consider the
plane case for definiteness.

THEOREM. If $f(x, y) \in L_p$ on a bounded measurable set $\mathscr{E}$, given
any $\varepsilon > 0$, there exists an $\eta > 0$ such that

$$\|f(x + h, y + k) - f(x, y)\|^p = \iint_{\mathscr{E}} |f(x + h, y + k) - f(x, y)|^p\, dx\, dy \le \varepsilon^p, \quad \text{if } |h| \text{ and } |k| \le \eta. \qquad (162)$$

The point $(x + h, y + k)$ may no longer belong to $\mathscr{E}$, so we
continue $f(x, y)$ outside $\mathscr{E}$ by zero. The norm subscript indicates the
set with respect to which the norm is taken. We can include $\mathscr{E}$ in a
finite closed interval $\Delta_0$ ($a_1 \le x \le b_1$; $a_2 \le y \le b_2$). By
Theorem 1 of [60], there exists a $\varphi(x, y)$, continuous in $\Delta_0$, such
that $\|f - \varphi\|_{\Delta_0} \le \varepsilon/4$. We can continue
$\varphi(x, y)$ on to a wider interval whilst preserving its continuity, say
on to $\Delta_1$ ($a_1 - 1 \le x \le b_1 + 1$; $a_2 - 1 \le y \le b_2 + 1$).
We write $f(x + h, y + k) - f(x, y)$ as

$$f(x + h, y + k) - f(x, y) = [f(x + h, y + k) - \varphi(x + h, y + k)] + [\varphi(x + h, y + k) - \varphi(x, y)] + [\varphi(x, y) - f(x, y)].$$

We have

$$\|f(x + h, y + k) - f(x, y)\|_{\Delta_0} \le \|f(x + h, y + k) - \varphi(x + h, y + k)\|_{\Delta_0} + \|\varphi(x + h, y + k) - \varphi(x, y)\|_{\Delta_0} + \|\varphi(x, y) - f(x, y)\|_{\Delta_0}.$$


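We had $\|\varphi(x, y) - f(x, y)\|_{\Delta_0} \le \varepsilon/4$. In view of the
uniform continuity of $\varphi(x, y)$ in $\Delta_1$, there exists an
$\eta_1 > 0$ (we assume $\eta_1 < 1$) such that
$\|\varphi(x + h, y + k) - \varphi(x, y)\|_{\Delta_0} \le \varepsilon/4$ for $|h|$
and $|k| \le \eta_1$, so that

$$\|f(x + h, y + k) - f(x, y)\|_{\Delta_0} \le \frac{\varepsilon}{2} + \|f(x + h, y + k) - \varphi(x + h, y + k)\|_{\Delta_0} \quad \text{for } |h|, |k| \le \eta_1. \qquad (163)$$

We have

$$\|f(x + h, y + k) - \varphi(x + h, y + k)\|^p_{\Delta_0} = \iint_{\Delta_0} |f(x + h, y + k) - \varphi(x + h, y + k)|^p\, dx\, dy,$$

or

$$\|f(x + h, y + k) - \varphi(x + h, y + k)\|^p_{\Delta_0} = \iint_{\Delta_0(h, k)} |f(x, y) - \varphi(x, y)|^p\, dx\, dy,$$

where $\Delta_0(h, k)$ is the interval obtained from $\Delta_0$ by parallel
displacement along the vector $(h, k)$. On taking say $h$ and $k > 0$, and
recalling that $f(x, y) = 0$ outside $\Delta_0$, we get

$$\|f(x + h, y + k) - \varphi(x + h, y + k)\|^p_{\Delta_0} = \|f(x, y) - \varphi(x, y)\|^p_{\Delta'} + \iint_{\substack{a_1 + h \le x \le b_1 + h \\ b_2 < y \le b_2 + k}} |\varphi(x, y)|^p\, dx\, dy + \iint_{\substack{b_1 < x \le b_1 + h \\ a_2 + k \le y \le b_2}} |\varphi(x, y)|^p\, dx\, dy,$$

where $\Delta'$ is part of $\Delta_0$, so that
$\|f(x, y) - \varphi(x, y)\|_{\Delta'} \le \varepsilon/4$. The last two integrals
obviously tend to zero as $h$ and $k \to 0$, so that there exists an
$\eta_2 > 0$ such that

$$\|f(x + h, y + k) - \varphi(x + h, y + k)\|_{\Delta_0} \le \frac{\varepsilon}{2} \quad \text{for } |h| \text{ and } |k| \le \eta_2.$$

Let $\eta = \min(\eta_1, \eta_2)$. We obtain from (163):
$\|f(x + h, y + k) - f(x, y)\|_{\Delta_0} \le \varepsilon$ for $|h|$ and
$|k| \le \eta$, and so all the more
$\|f(x + h, y + k) - f(x, y)\|_{\mathscr{E}} \le \varepsilon$ for $|h|$ and
$|k| \le \eta$, which is what we set out to prove.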

The theorem can also be proved for the case of an unbounded 
measurable set. 
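Continuity in the mean does not require $f$ itself to be continuous. The following minimal sketch illustrates this numerically under simplifying assumptions that are not in the text: it works in one variable instead of two, takes $f$ to be the indicator of $[0, 1]$ continued by zero, and uses a midpoint rule. The names `indicator` and `shift_norm_squared` are hypothetical. For this $f$ the squared norm of $f(x + h) - f(x)$ equals $2|h|$ for $|h| \le 1$, so it tends to zero with $h$ although $f$ has jumps.

```python
def indicator(x):
    """f = 1 on [0, 1], continued by zero outside (an illustrative choice)."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def shift_norm_squared(h, a=-1.0, b=2.0, n=3000):
    """Midpoint-rule value of the integral of |f(x + h) - f(x)|^2 over [a, b].

    The interval [a, b] is assumed wide enough to contain [0, 1] and its shift.
    """
    step = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * step
        total += (indicator(x + h) - indicator(x)) ** 2
    return total * step


if __name__ == "__main__":
    for h in (0.5, 0.1, 0.01):
        print(h, shift_norm_squared(h))
```

The printed values should shrink roughly in proportion to $h$, in line with the theorem.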


71. Mean functions. An averaging process can be introduced for any summable function $f(P)$. It leads us to a sequence of functions which have derivatives of all orders and tend to $f(P)$ in the usual sense. We shall take for definiteness the case of a plane and use the coordinates of points instead of the points themselves. Let $\omega_1(P, Q) = \omega_1(x, y; \xi, \eta)$ be a function depending only on the distance

$$r = |PQ| = \sqrt{(x - \xi)^2 + (y - \eta)^2},$$

equal to zero for $r \ge 1$, continuous and having continuous derivatives of all orders with respect to all four coordinates. Notice here that differentiation of $\omega_1(P, Q)$ with respect to $x$ and $y$ can be replaced by differentiation with respect to $\xi$ and $\eta$ when the sign is changed in the result. Suppose further that

$$\iint \omega_1(x, y; \xi, \eta)\,dx\,dy = 1. \tag{164}$$

We do not write the domain of integration, on the assumption that the integral is taken over the whole plane. By what has been said, the integration is in fact performed over the circle $(x - \xi)^2 + (y - \eta)^2 \le 1$. If the integration with respect to $(x, y)$ in (164) is replaced by integration with respect to $(\xi, \eta)$, the result will be the same.

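We now introduce the following notation:

$$\omega_\rho(x, y; \xi, \eta) = \omega_1\!\left(\frac{x}{\rho}, \frac{y}{\rho}; \frac{\xi}{\rho}, \frac{\eta}{\rho}\right) \quad (\rho > 0). \tag{165}$$

The function $\omega_\rho$ has the same differential properties as $\omega_1$, but $\omega_\rho = 0$ for $r \ge \rho$, and

$$\iint \omega_\rho(x, y; \xi, \eta)\,dx\,dy = \rho^2. \tag{166}$$

We shall indicate later a possible choice for $\omega_1(P, Q)$ [cf. IV; 157].

We shall in future use $C_\rho(\xi, \eta)$ to denote the circle with centre $(\xi, \eta)$ and radius $\rho$. We further introduce the notation:

$$\iint |\omega_1(x, y; \xi, \eta)|\,dx\,dy = C, \tag{167}$$

whence

$$\iint |\omega_\rho(x, y; \xi, \eta)|\,dx\,dy = C\rho^2. \tag{168}$$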


Suppose we have the summable function $f(x, y)$ in the bounded open or closed domain $D_0$ (instead of a domain we could take any bounded measurable set). We continue it by zero on to the whole of the plane and form the mean function from it:

$$f_\rho(\xi, \eta) = \frac{1}{\rho^2} \iint f(x, y)\,\omega_\rho(x, y; \xi, \eta)\,dx\,dy. \tag{169}$$

The positive number $\rho$ is usually called the averaging radius.

The integrand is zero outside the circle $C_\rho(\xi, \eta)$, and, if the distance $d$ from the point $(\xi, \eta)$ to $D_0$ is greater than zero, we have $f_\rho(\xi, \eta) = 0$ for $\rho \le d$.

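THEOREM 1. The function $f_\rho(\xi, \eta)$ is continuous and has continuous partial derivatives of any order throughout the plane.

Since $\omega_\rho$ depends only on the differences $x - \xi$ and $y - \eta$, we have $\omega_\rho(x, y; \xi + h, \eta + k) = \omega_\rho(x - h, y - k; \xi, \eta)$, and therefore

$$|f_\rho(\xi + h, \eta + k) - f_\rho(\xi, \eta)| \le \frac{1}{\rho^2} \iint |f(x, y)|\,|\omega_\rho(x - h, y - k; \xi, \eta) - \omega_\rho(x, y; \xi, \eta)|\,dx\,dy. \tag{170}$$

It follows from the uniform continuity of $\omega_\rho$ as a function of $x$, $y$, $\xi$, $\eta$ that, given any $\varepsilon > 0$, there exists a $\delta > 0$ such that

$$|\omega_\rho(x - h, y - k; \xi, \eta) - \omega_\rho(x, y; \xi, \eta)| < \varepsilon \quad \text{for } |h| \text{ and } |k| < \delta,$$

and consequently,

$$|f_\rho(\xi + h, \eta + k) - f_\rho(\xi, \eta)| \le \frac{\varepsilon}{\rho^2} \iint |f(x, y)|\,dx\,dy \quad (|h| \text{ and } |k| < \delta),$$

whence, since $\varepsilon$ is arbitrary, it follows that $f_\rho(\xi, \eta)$ is continuous.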
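We now prove the existence and continuity of the derivative with respect to $\xi$:

$$\frac{f_\rho(\xi + h, \eta) - f_\rho(\xi, \eta)}{h} = \frac{1}{\rho^2} \iint f(x, y)\,\frac{\omega_\rho(x - h, y; \xi, \eta) - \omega_\rho(x, y; \xi, \eta)}{h}\,dx\,dy. \tag{171}$$

By the mean value theorem:

$$\frac{\omega_\rho(x - h, y; \xi, \eta) - \omega_\rho(x, y; \xi, \eta)}{h} = -\frac{\partial \omega_\rho}{\partial x}(x - \theta h, y; \xi, \eta) \quad (0 < \theta < 1).$$

The absolute value of the right-hand side does not exceed some number $K$ for any $h$, and the integrand in (171) has an absolute value not exceeding the summable function $K\,|f(x, y)|$, so that we can pass to the limit under the integral sign, and obtain

$$\frac{\partial f_\rho(\xi, \eta)}{\partial \xi} = -\frac{1}{\rho^2} \iint f(x, y)\,\frac{\partial \omega_\rho(x, y; \xi, \eta)}{\partial x}\,dx\,dy = \frac{1}{\rho^2} \iint f(x, y)\,\frac{\partial \omega_\rho(x, y; \xi, \eta)}{\partial \xi}\,dx\,dy.$$

The continuity of this partial derivative can be proved in precisely the same way as the continuity of $f_\rho(\xi, \eta)$. The above proof is also applicable to the further derivatives, and they are obtained by differentiation under the integral sign:

$$\frac{\partial^{p+q} f_\rho(\xi, \eta)}{\partial \xi^p\,\partial \eta^q} = \frac{1}{\rho^2} \iint f(x, y)\,\frac{\partial^{p+q} \omega_\rho(x, y; \xi, \eta)}{\partial \xi^p\,\partial \eta^q}\,dx\,dy. \tag{172}$$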


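THEOREM 2. If $f(x, y) \in L_2(D_0)$, i.e. $f$ belongs to $L_2$ on $D_0$, then

$$\iint |f_\rho(x, y)|^2\,dx\,dy \le C_1 \iint |f(x, y)|^2\,dx\,dy, \tag{173}$$

where $C_1$ is a constant, and

$$\lim_{\rho \to 0} \|f - f_\rho\|^2 = \lim_{\rho \to 0} \iint |f(x, y) - f_\rho(x, y)|^2\,dx\,dy = 0. \tag{174}$$

We have

$$|f_\rho(\xi, \eta)| \le \frac{1}{\rho^2} \iint \sqrt{|\omega_\rho(x, y; \xi, \eta)|}\,\sqrt{|\omega_\rho(x, y; \xi, \eta)|}\,|f(x, y)|\,dx\,dy.$$

We apply Buniakowski's inequality:

$$|f_\rho(\xi, \eta)|^2 \le \frac{1}{\rho^2} \iint |\omega_\rho(x, y; \xi, \eta)|\,dx\,dy \cdot \frac{1}{\rho^2} \iint |f(x, y)|^2\,|\omega_\rho(x, y; \xi, \eta)|\,dx\,dy. \tag{175}$$

On using (168), integrating with respect to $(\xi, \eta)$ and changing the order of integration [69], we obtain

$$\iint |f_\rho(\xi, \eta)|^2\,d\xi\,d\eta \le C \iint |f(x, y)|^2 \left[\frac{1}{\rho^2} \iint |\omega_\rho(x, y; \xi, \eta)|\,d\xi\,d\eta\right] dx\,dy,$$

whence, by (168), we obtain (173) ($C_1 = C^2$), where the integration is carried out on the right over $D_0$, since $f(x, y) = 0$ outside $D_0$.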

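We turn to the proof of (174). On taking (166) and (169) into account, we can write

$$\|f - f_\rho\|^2 = \iint |f(\xi, \eta) - f_\rho(\xi, \eta)|^2\,d\xi\,d\eta = \iint \left|\frac{1}{\rho^2} \iint [f(\xi, \eta) - f(x, y)]\,\omega_\rho(x, y; \xi, \eta)\,dx\,dy\right|^2 d\xi\,d\eta. \tag{176}$$

We replace $(x, y)$ in the inner integral by new variables of integration $(u, v)$, in accordance with $x = \xi + u$; $y = \eta + v$, and on observing that $\omega_\rho(\xi + u, \eta + v; \xi, \eta) = 0$ for $u^2 + v^2 \ge \rho^2$, we obtain by applying Buniakowski's inequality:

$$\left|\iint [f(\xi, \eta) - f(x, y)]\,\omega_\rho(x, y; \xi, \eta)\,dx\,dy\right|^2 \le \iint_{u^2 + v^2 \le \rho^2} |f(\xi, \eta) - f(\xi + u, \eta + v)|^2\,du\,dv \times \iint_{u^2 + v^2 \le \rho^2} \omega_\rho^2(\xi + u, \eta + v; \xi, \eta)\,du\,dv.$$

But $\omega_\rho^2 \le C_2$ ($C_2$ is a constant, independent of $\rho$), so that the right-hand side does not exceed

$$C_2 \pi \rho^2 \iint_{u^2 + v^2 \le \rho^2} |f(\xi, \eta) - f(\xi + u, \eta + v)|^2\,du\,dv.$$

As a result, (176) gives

$$\|f - f_\rho\|^2 \le \frac{C_2 \pi}{\rho^2} \iint \left[\iint_{u^2 + v^2 \le \rho^2} |f(\xi, \eta) - f(\xi + u, \eta + v)|^2\,du\,dv\right] d\xi\,d\eta. \tag{177}$$

We change the order of integration:

$$\|f - f_\rho\|^2 \le \frac{C_2 \pi}{\rho^2} \iint_{u^2 + v^2 \le \rho^2} \left[\iint |f(\xi, \eta) - f(\xi + u, \eta + v)|^2\,d\xi\,d\eta\right] du\,dv. \tag{178}$$

In view of the continuity in the mean, given any $\varepsilon > 0$ there exists a $\delta > 0$ such that

$$\iint |f(\xi, \eta) - f(\xi + u, \eta + v)|^2\,d\xi\,d\eta < \varepsilon, \quad \text{if } u^2 + v^2 < \delta^2.$$

Inequality (178) now gives

$$\|f - f_\rho\|^2 \le \frac{C_2 \pi \varepsilon}{\rho^2} \iint_{u^2 + v^2 \le \rho^2} du\,dv = C_2 \pi^2 \varepsilon \quad \text{for } 0 < \rho < \delta,$$

which leads to (174), since $\varepsilon$ is arbitrary.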

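We shall prove a further theorem, which will be needed later.

THEOREM 3. Let $U$ be a set of functions $f(x, y)$ of $L_2(D_0)$, bounded in norm by the same number. Given a fixed $\rho$, the corresponding functions $f_\rho(x, y)$ are bounded in modulus by the same number and are equicontinuous.

By hypothesis, there exists a positive number $m$ such that, for all functions $f(x, y)$ of $U$:

$$\iint_{D_0} |f(x, y)|^2\,dx\,dy \le m. \tag{179}$$

On applying Buniakowski's inequality to the right-hand side of (169), we get

$$|f_\rho(\xi, \eta)|^2 \le \frac{1}{\rho^4} \iint \omega_\rho^2(x, y; \xi, \eta)\,dx\,dy \cdot \iint_{D_0} |f(x, y)|^2\,dx\,dy \le \frac{m}{\rho^4} \iint \omega_\rho^2(x, y; \xi, \eta)\,dx\,dy,$$

and the right-hand side depends neither on $(\xi, \eta)$ nor on the choice of $f(x, y)$ of $U$.

It remains to prove the equicontinuity. We apply Buniakowski's inequality to the right-hand side of (170) and make use of (179):

$$|f_\rho(\xi + h, \eta + k) - f_\rho(\xi, \eta)|^2 \le \frac{m}{\rho^4} \iint |\omega_\rho(x - h, y - k; \xi, \eta) - \omega_\rho(x, y; \xi, \eta)|^2\,dx\,dy,$$

and it follows from the uniform continuity of $\omega_\rho$ that the $f_\rho(\xi, \eta)$ are equicontinuous for any choice of $f(x, y)$ of $U$.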

The above proofs are applicable for $L_p$ with $p \ge 1$. If $p > 1$, we have to use the Hölder instead of the Buniakowski inequality, and the formula

$$|\omega_\rho(x, y; \xi, \eta)| = |\omega_\rho(x, y; \xi, \eta)|^{1/p}\,|\omega_\rho(x, y; \xi, \eta)|^{1/p'} \quad \left(\frac{1}{p} + \frac{1}{p'} = 1\right).$$

When $p = 1$, the proof is as above, without the use of the Buniakowski and Hölder inequalities. The following theorem therefore holds.

THEOREM 2′. If $f(x, y) \in L_p(D_0)$ ($p \ge 1$), then

$$\iint |f_\rho(x, y)|^p\,dx\,dy \le C_1 \iint |f(x, y)|^p\,dx\,dy; \tag{180}$$

$$\lim_{\rho \to 0} \iint |f(x, y) - f_\rho(x, y)|^p\,dx\,dy = 0. \tag{181}$$

Theorem 3 also holds, with $L_2$ replaced by $L_p$. The proof also remains in force when $D_0$ is an unbounded measurable set, e.g. the entire plane. By using Theorem 3, the lineal of continuous finite functions with continuous derivatives of all orders is easily shown to be everywhere dense in $L_p$ on the entire plane.

A further point: everything said above holds both for the real and the complex space $L_p$. We have discussed mean functions on the assumption that $f(x, y) \in L_p$ ($p \ge 1$). We now suppose that $f(x, y)$ is continuous in the open domain $D_0$ and let $D'$ be any fixed closed domain lying inside $D_0$. By virtue of the uniform continuity of $f(x, y)$ in any closed domain lying inside $D_0$, given an $\varepsilon > 0$, there exists a $\delta > 0$ such that $|f(x, y) - f(\xi, \eta)| < \varepsilon$ if $(x, y) \in C_\rho(\xi, \eta)$ for $\rho < \delta$ and $(\xi, \eta)$ is any point of $D'$. Since

$$|f(\xi, \eta) - f_\rho(\xi, \eta)| \le \frac{1}{\rho^2} \iint |f(\xi, \eta) - f(x, y)|\,|\omega_\rho(x, y; \xi, \eta)|\,dx\,dy \tag{182}$$

we have

$$|f(\xi, \eta) - f_\rho(\xi, \eta)| \le \varepsilon C \quad \text{for } \rho < \delta \text{ and } (\xi, \eta) \in D'.$$

If, in addition, $f(x, y)$ has continuous derivatives up to any order in $D_0$, on replacing differentiation with respect to $\xi$ and $\eta$ by differentiation with respect to $x$ and $y$ in (172), carrying out the integration by parts and using the properties of $\omega_\rho$, we obtain for $(\xi, \eta) \in D'$ and sufficiently small $\rho$:

$$\frac{\partial^{p+q} f_\rho(\xi, \eta)}{\partial \xi^p\,\partial \eta^q} = \frac{1}{\rho^2} \iint \frac{\partial^{p+q} f(x, y)}{\partial x^p\,\partial y^q}\,\omega_\rho(x, y; \xi, \eta)\,dx\,dy, \tag{183}$$

i.e. the derivative of the mean function is equal to the mean function of the derivative for $(\xi, \eta) \in D'$ and sufficiently small $\rho$.

Notice that, given the conditions mentioned, $\omega_\rho(x, y; \xi, \eta)$ is equal to zero close to the boundary of $D_0$ and on account of this, the line integrals vanish when integrating by parts. We can choose for the paths of integration in these integrals sufficiently smooth curves lying sufficiently close to the boundary of $D_0$.

It follows from what has been said that:

THEOREM 4. If $f(x, y)$ is continuous along with its derivatives up to some order $l$ inside $D_0$, in any closed domain $D'$ lying inside $D_0$ the mean function and its derivatives up to order $l$ tend uniformly to $f(x, y)$ and the corresponding derivatives as $\rho \to 0$.

It may be remarked further that, if $f(x, y)$ is bounded: $|f(x, y)| \le m$, then

$$|f_\rho(\xi, \eta)| \le \frac{1}{\rho^2} \iint |f(x, y)|\,|\omega_\rho(x, y; \xi, \eta)|\,dx\,dy \le Cm. \tag{184}$$


THEOREM 5. If $F(x, y)$ is summable over any closed domain $D'$ lying inside $D_0$, and has the property that

$$\iint_{D_0} F(x, y)\,\varphi(x, y)\,dx\,dy = 0 \tag{185}$$

for any choice of a continuous $\varphi(x, y)$ with continuous derivatives up to some order $l$ inside $D_0$ and vanishing outside some closed domain lying inside $D_0$, then $F$ is equivalent to zero.

It is sufficient to show that the hypotheses imply that (185) holds for any bounded measurable finite $\varphi(x, y)$, i.e. equal to zero outside some closed domain lying inside $D_0$ (as in the hypotheses, this domain may be different for different $\varphi(x, y)$). After this, the proof that $F$ is equivalent to zero is precisely the same as the proof of Theorem 12 of [52]. Let $\varphi(x, y)$ be such a function, where $|\varphi(x, y)| \le m$ and $\varphi_\rho(x, y)$ is the mean function, so that $|\varphi_\rho(x, y)| \le Cm$. The function $\varphi_\rho(x, y)$ is finite for sufficiently small $\rho$ and has continuous derivatives of all orders, so that, by hypothesis,

$$\iint_{D_0} F(x, y)\,\varphi_\rho(x, y)\,dx\,dy = 0. \tag{186}$$

The functions $\varphi_\rho(x, y)$ are convergent as $\rho \to 0$ in $L_2$ to $\varphi(x, y)$ on $D_0$, and a sequence $\varphi_{\rho_n}(x, y)$ exists, which tends to $\varphi(x, y)$ almost everywhere. In addition, $|F(x, y)\,\varphi_{\rho_n}(x, y)| \le Cm\,|F(x, y)|$, where the $F(x, y)\,\varphi_{\rho_n}(x, y)$ are finite and we have a summable function on the right. On passing to the limit in (186) with $\rho = \rho_n$, we get (185), and the theorem is proved.
We must now mention one of the possibilities for choosing the function $\omega_1(x, y; \xi, \eta)$; in fact, we put [cf. IV; 157]:

$$\omega_1(x, y; \xi, \eta) = \begin{cases} C_0\,e^{r^2/(r^2 - 1)} & \text{for } r < 1, \\ 0 & \text{for } r \ge 1, \end{cases} \qquad (r^2 = (x - \xi)^2 + (y - \eta)^2), \tag{187}$$

where the constant $C_0$ is chosen so that condition (164) is fulfilled. The presence of continuous derivatives of all orders for $r < 1$ and $r > 1$ is obvious. As $r \to 1$ from smaller values, $e^{r^2/(r^2 - 1)} \to 0$. It is easily shown by induction that the derivative of any order has the form, with $r < 1$:

$$\frac{\partial^{l+m}}{\partial \xi^l\,\partial \eta^m}\,e^{r^2/(r^2 - 1)} = \frac{p_{l,m}(x - \xi, y - \eta)}{(r^2 - 1)^{2(l+m)}}\,e^{r^2/(r^2 - 1)},$$

where $p_{l,m}(u, v)$ is a polynomial. As the point $(\xi, \eta)$ approaches the circumference $r = 1$, this expression tends to zero. On using the finite increments formula, we find that the derivative exists at any point of this circumference and is equal to zero. In our example (187), $\omega_\rho > 0$ for $r < \rho$, and the constant $C$ of (167) is equal to unity.
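The averaging process can be tried out numerically. The sketch below is only an illustration under assumptions that go beyond the text: it uses a one-variable analogue of (169) and (187), so the normalising factor is $1/\rho$ instead of $1/\rho^2$, it replaces the integrals by a midpoint rule, and the names `kernel`, `integrate`, `C0` and `mean_function` are hypothetical. For $f(x) = |x|$ the mean function agrees with $f$ away from the corner, and the error at the corner is proportional to $\rho$, in line with the uniform convergence asserted in Theorem 4 for continuous functions.

```python
import math


def kernel(r):
    """Unnormalised kernel exp(r^2 / (r^2 - 1)) for |r| < 1, zero otherwise."""
    if abs(r) >= 1.0:
        return 0.0
    return math.exp(r * r / (r * r - 1.0))


def integrate(g, a, b, n=2000):
    """Midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


# Normalising constant, chosen so that the kernel integrates to one (cf. (164)).
C0 = 1.0 / integrate(kernel, -1.0, 1.0)


def mean_function(f, rho, xi, n=2000):
    """One-variable analogue of (169): average of f around xi with radius rho."""
    def integrand(x):
        return f(x) * C0 * kernel((x - xi) / rho)
    return integrate(integrand, xi - rho, xi + rho, n) / rho


if __name__ == "__main__":
    for rho in (0.4, 0.2, 0.1):
        # Error at the corner of |x|; it should scale linearly with rho.
        print(rho, mean_function(abs, rho, 0.0))
```

Only the corner of $|x|$ is smoothed: at a point $\xi$ with $|\xi| \ge \rho$ the average of the linear function over a symmetric kernel returns $|\xi|$ itself.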


CHAPTER III 


SET FUNCTIONS. ABSOLUTE 
CONTINUITY. GENERALIZATION 
OF THE INTEGRAL 


72. Additive set functions. Let $f(P)$ be a point function measurable with respect to a non-negative, additive and normal function $G(\mathcal{E})$. We form the indefinite integral

$$\varphi(\mathcal{E}) = \int_{\mathcal{E}} f(P)\,G(d\mathcal{E}). \tag{1}$$

It is defined for all the sets $\mathcal{E}$, belonging to the closed field $L_G$, on which $f(P)$ is summable. Here, if $f(P)$ is summable on $\mathcal{E}$, it is also summable on any measurable part $\mathcal{E}'$ of $\mathcal{E}$, and if $\mathcal{E}$ is split into a finite or denumerable number of disjoint sets $\mathcal{E}_k$, then $\varphi(\mathcal{E})$ is equal to the sum of the $\varphi(\mathcal{E}_k)$ (it is completely additive).

We shall next consider the properties of completely additive functions given in any manner, and not necessarily as an indefinite integral. Thus, let $\varphi(\mathcal{E})$ take finite real values for sets belonging to some family $C$ of sets of some closed field of sets $T$, containing all closed and open sets. We assume here that, if $\mathcal{E}$ belongs to $C$, every part of $\mathcal{E}$ belonging to $T$ also belongs to $C$. Moreover, we assume that $\varphi(\mathcal{E})$ is completely additive, i.e. if $\mathcal{E}$ belonging to $C$ is split into disjoint sets $\mathcal{E}_k$, the number of which is finite or denumerable, where all the $\mathcal{E}_k$ belong to $T$ and hence to $C$:

$$\mathcal{E} = \sum_k \mathcal{E}_k, \tag{2}$$

then

$$\varphi(\mathcal{E}) = \sum_k \varphi(\mathcal{E}_k). \tag{3}$$

When the number of terms is infinite, the series written must be absolutely convergent. In the case of (1), the closed field $T$ is the field $L_G$, and the family $C$ consists of all the sets $\mathcal{E}$ of $L_G$ on which $f(P)$ is summable. The most important case for what follows is that when $C$ consists of some $\mathcal{E}_0$ belonging to $L_G$ and the sets $\mathcal{E}$ of $L_G$ which belong to $\mathcal{E}_0$. In this case $C$ itself is obviously a closed field.

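Notice that, if $\varphi(\mathcal{E})$ is defined for sets of $T$ belonging say to some closed interval $\Delta$, it can be defined for all sets $\mathcal{E}$ of $T$ by the formula

$$\varphi(\mathcal{E}) = \varphi(\mathcal{E}\Delta). \tag{4}$$

Here, it will take finite values and will be completely additive on the whole of the field $T$. In future, when we speak of $\varphi(\mathcal{E})$, we shall naturally assume that $\mathcal{E} \in C$. In view of the additivity, we must have $\varphi(\mathcal{E}) = 0$ if $\mathcal{E}$ is the empty set. It follows at once from the additivity that, if $\mathcal{E}'$ and $\mathcal{E}''$ belong to $C$ and $\mathcal{E}' \subset \mathcal{E}''$, then

$$\varphi(\mathcal{E}'' - \mathcal{E}') = \varphi(\mathcal{E}'') - \varphi(\mathcal{E}'). \tag{5}$$

Further, it follows from the complete additivity that, if $\mathcal{E}_n$ ($n = 1, 2, \ldots$) is a monotonic sequence of sets of $C$ and if the limit set $\mathcal{E}$ also belongs to $C$, then $\varphi(\mathcal{E}_n) \to \varphi(\mathcal{E})$. In the case of a non-decreasing sequence of sets we have $\mathcal{E} = \mathcal{E}_1 + (\mathcal{E}_2 - \mathcal{E}_1) + (\mathcal{E}_3 - \mathcal{E}_2) + \ldots$ and, in view of the complete additivity, $\varphi(\mathcal{E}) = \varphi(\mathcal{E}_1) + [\varphi(\mathcal{E}_2) - \varphi(\mathcal{E}_1)] + [\varphi(\mathcal{E}_3) - \varphi(\mathcal{E}_2)] + \ldots$, i.e. $\varphi(\mathcal{E}_n) \to \varphi(\mathcal{E})$. In the case of a non-increasing sequence of $\mathcal{E}_n$, the proof is similar, $\mathcal{E}$ being necessarily of $C$. Notice further that a finite linear combination of completely additive functions: $c_1\varphi_1(\mathcal{E}) + c_2\varphi_2(\mathcal{E}) + \ldots + c_p\varphi_p(\mathcal{E})$, is obviously also completely additive. We turn to the proof of the theorems fundamental to the theory.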

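THEOREM 1. The absolute value of $\varphi(\mathcal{E})$ is always bounded by the same number, whatever the set $\mathcal{E}$ belonging to any fixed set $\mathcal{E}_1$ of $C$.

We use reductio ad absurdum. If this is not the case, there exists an $\mathcal{E}_2 \subset \mathcal{E}_1$ such that $|\varphi(\mathcal{E}_2)| > 2$ and $|\varphi(\mathcal{E}_1 - \mathcal{E}_2)| > 2$. We have to take into account here that

$$\varphi(\mathcal{E}_1) = \varphi(\mathcal{E}_2) + \varphi(\mathcal{E}_1 - \mathcal{E}_2),$$

and that the fact of one term being unbounded implies that the other is unbounded, since $\varphi(\mathcal{E}_1)$ is a given number. The theorem is not fulfilled for $\mathcal{E}_2$ or $\mathcal{E}_1 - \mathcal{E}_2$. We can assume that it is not satisfied for $\mathcal{E}_2$, and there exists an $\mathcal{E}_3 \subset \mathcal{E}_2$ such that $|\varphi(\mathcal{E}_3)| > 3$ and $|\varphi(\mathcal{E}_2 - \mathcal{E}_3)| > 3$ and so on. We have: $\mathcal{E}_1 \supset \mathcal{E}_2 \supset \mathcal{E}_3 \ldots$ and, on writing $\mathcal{E} = \mathcal{E}_1\mathcal{E}_2\ldots$, by what has been said, $\varphi(\mathcal{E}_n) \to \varphi(\mathcal{E})$, which is absurd, since $\varphi(\mathcal{E}_n)$ is indefinitely increasing in absolute value. Thus the theorem is proved.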
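Let $\delta$ be some subdivision of $\mathcal{E}$ into a finite number of sets $\mathcal{E}_k$. We form the sum

$$t_\delta = \sum_k |\varphi(\mathcal{E}_k)| \tag{6}$$

and show that the set of values of $t_\delta$ is bounded for any $\delta$. Let $\mathcal{E}_\delta^+$ be the sum of the $\mathcal{E}_k$ for which $\varphi(\mathcal{E}_k) \ge 0$, and $\mathcal{E}_\delta^-$ the sum of the $\mathcal{E}_k$ for which $\varphi(\mathcal{E}_k) < 0$. Since $\varphi(\mathcal{E})$ is additive, we can write

$$t_\delta = \varphi(\mathcal{E}_\delta^+) - \varphi(\mathcal{E}_\delta^-). \tag{7}$$

On also observing that $\mathcal{E}_\delta^+ + \mathcal{E}_\delta^- = \mathcal{E}$, so that $\varphi(\mathcal{E}) = \varphi(\mathcal{E}_\delta^+) + \varphi(\mathcal{E}_\delta^-)$, we can rewrite (7) as

$$t_\delta = 2\varphi(\mathcal{E}_\delta^+) - \varphi(\mathcal{E}) = \varphi(\mathcal{E}) - 2\varphi(\mathcal{E}_\delta^-). \tag{8}$$

We write $\overline{\varphi}(\mathcal{E})$ and $\underline{\varphi}(\mathcal{E})$ for the strict upper and lower bounds of $\varphi(e)$ if $e \subset \mathcal{E}$, the empty set being also assumed to belong to $\mathcal{E}$:

$$\overline{\varphi}(\mathcal{E}) = \sup \varphi(e); \quad \underline{\varphi}(\mathcal{E}) = \inf \varphi(e) \quad (e \subset \mathcal{E}). \tag{9}$$

By Theorem 1, we can say that $\overline{\varphi}(\mathcal{E})$ and $\underline{\varphi}(\mathcal{E})$ are finite. It follows from the first of formulae (8) that $t_\delta$ is bounded for any choice of $\delta$: $t_\delta \le 2\overline{\varphi}(\mathcal{E}) - \varphi(\mathcal{E})$. The strict upper bound of the sums $t_\delta$ for all possible subdivisions $\delta$ is called the total variation of $\varphi(\mathcal{E})$ on the set $\mathcal{E}$. We write it as $\overline{\underline{\varphi}}(\mathcal{E})$. If $\delta_n$ is a sequence of subdivisions such that $t_{\delta_n}$ tends to $\overline{\underline{\varphi}}(\mathcal{E})$, it follows from the first of formulae (8) that $\varphi(\mathcal{E}_{\delta_n}^+)$ now tends to $\overline{\varphi}(\mathcal{E})$, whilst the second of formulae (8), which can be rewritten as

$$2\varphi(\mathcal{E}_{\delta_n}^-) = \varphi(\mathcal{E}) - t_{\delta_n},$$

shows that $\varphi(\mathcal{E}_{\delta_n}^-) \to \underline{\varphi}(\mathcal{E})$, so that (8) gives in the limit, with $\delta$ replaced by $\delta_n$:

$$\overline{\underline{\varphi}}(\mathcal{E}) = 2\overline{\varphi}(\mathcal{E}) - \varphi(\mathcal{E}) = \varphi(\mathcal{E}) - 2\underline{\varphi}(\mathcal{E}),$$

whence

$$\overline{\varphi}(\mathcal{E}) = \tfrac{1}{2}\left[\overline{\underline{\varphi}}(\mathcal{E}) + \varphi(\mathcal{E})\right]; \quad \underline{\varphi}(\mathcal{E}) = -\tfrac{1}{2}\left[\overline{\underline{\varphi}}(\mathcal{E}) - \varphi(\mathcal{E})\right]; \quad \overline{\underline{\varphi}}(\mathcal{E}) = \overline{\varphi}(\mathcal{E}) - \underline{\varphi}(\mathcal{E}), \tag{10}$$

$$\varphi(\mathcal{E}) = \overline{\varphi}(\mathcal{E}) + \underline{\varphi}(\mathcal{E}) = \overline{\varphi}(\mathcal{E}) - (-\underline{\varphi}(\mathcal{E})). \tag{11}$$

It follows from the definition of $\overline{\varphi}(\mathcal{E})$ and $\underline{\varphi}(\mathcal{E})$ that $\overline{\varphi}(\mathcal{E}) \ge 0$ and $\underline{\varphi}(\mathcal{E}) \le 0$. The functions $\overline{\varphi}(\mathcal{E})$ and $-\underline{\varphi}(\mathcal{E})$ are usually called the positive and negative variations of $\varphi(\mathcal{E})$ on $\mathcal{E}$.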
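Relations (10) and (11) can be checked by brute force on a small example. The sketch below assumes a hypothetical finite model that is not in the text: the set consists of four points, every subset is admissible, and the additive function is given by point weights. The names `weights`, `phi`, `subsets` and `partitions` are illustrative only. The upper and lower bounds are taken over all subsets as in (9), and the total variation over all subdivisions as in (6).

```python
from itertools import combinations

# Hypothetical point weights defining an additive set function on four points.
weights = {"a": 2.0, "b": -3.0, "c": 0.5, "d": -1.0}


def phi(e):
    """Value of the additive function on a collection of points."""
    return sum(weights[p] for p in e)


def subsets(points):
    """All subsets of the given points, including the empty set."""
    pts = list(points)
    for r in range(len(pts) + 1):
        for c in combinations(pts, r):
            yield c


def partitions(points):
    """All subdivisions of the points into disjoint non-empty blocks."""
    pts = list(points)
    if not pts:
        yield []
        return
    first, rest = pts[0], pts[1:]
    for part in partitions(rest):
        # Put `first` into one of the existing blocks ...
        for i in range(len(part)):
            yield part[:i] + [part[i] + (first,)] + part[i + 1:]
        # ... or into a new block of its own.
        yield part + [(first,)]


E = tuple(weights)
upper = max(phi(e) for e in subsets(E))   # strict upper bound, here 2.5
lower = min(phi(e) for e in subsets(E))   # strict lower bound, here -4.0
total = max(sum(abs(phi(block)) for block in p) for p in partitions(E))  # 6.5

if __name__ == "__main__":
    print(upper, lower, total, phi(E))
```

In this model the supremum in (9) is attained on the points with positive weight and the infimum on those with negative weight, which is the finite counterpart of the sets $\mathcal{E}_\delta^+$ and $\mathcal{E}_\delta^-$ used above.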
THEOREM 2. The positive, negative and total variations are completely additive functions on $C$.

Given any subdivision of $\mathcal{E}$ into $\mathcal{E}_k$, we form the series with non-negative terms:

$$S = \sum_k \overline{\varphi}(\mathcal{E}_k),$$

and show that $S = \overline{\varphi}(\mathcal{E})$. First of all, we show that $S < +\infty$. For, if we had $S = +\infty$, given a suitable choice of $e_k \subset \mathcal{E}_k$, the sum

$$\sum_k \varphi(e_k)$$

would take as large a value as desired. But this sum is equal to $\varphi(e)$, where

$$e = \sum_k e_k \quad (e \subset \mathcal{E}),$$

and we have arrived at a contradiction with the assertion of Theorem 1. Given $\varepsilon > 0$, we have $\varphi(e) > S - \varepsilon$ for a suitable choice of $e_k$, so that all the more $\overline{\varphi}(\mathcal{E}) > S - \varepsilon$, whence, since $\varepsilon$ is arbitrary, $\overline{\varphi}(\mathcal{E}) \ge S$. Let us prove the reverse inequality. We choose $e \subset \mathcal{E}$ so that $\varphi(e) > \overline{\varphi}(\mathcal{E}) - \varepsilon$; let $e_k = e\mathcal{E}_k$. We have

$$\varphi(e) = \sum_k \varphi(e_k), \quad \text{i.e.} \quad \overline{\varphi}(\mathcal{E}) - \varepsilon < \sum_k \varphi(e_k),$$

whence all the more:

$$\overline{\varphi}(\mathcal{E}) - \varepsilon < \sum_k \overline{\varphi}(\mathcal{E}_k),$$

and, since $\varepsilon$ is arbitrary, $\overline{\varphi}(\mathcal{E}) \le \sum_k \overline{\varphi}(\mathcal{E}_k)$, and finally

$$\overline{\varphi}(\mathcal{E}) = \sum_k \overline{\varphi}(\mathcal{E}_k).$$

Similarly, the negative variation is completely additive, so that the same is true of the total variation, by (10). Equation (11) shows that every completely additive function is the difference between completely additive non-negative functions. Notice further that, if we had used a subdivision of $\mathcal{E}$ into an infinite number of sets to form the sums (6), the previous strict upper bound would have been obtained.

Any point (closed set) belongs to $T$. If $\mathcal{E} \in C$, any point $P$ of $\mathcal{E}$ belongs to $C$, and we can speak of the value $\varphi(P)$ of the function $\varphi(\mathcal{E})$ at the point $P$. If $\varphi(P) \ne 0$, $P$ is called a point of discontinuity of $\varphi(\mathcal{E})$. Otherwise, it is a point of continuity of $\varphi(\mathcal{E})$. If $\varphi(P) > 0$, it follows from the definitions given above that $\overline{\varphi}(P) = \varphi(P)$ and $\underline{\varphi}(P) = 0$, whilst if $\varphi(P) < 0$, then $\overline{\varphi}(P) = 0$ and $\underline{\varphi}(P) = \varphi(P)$. If $\varphi(\mathcal{E})$ is continuous at $P$, $\overline{\varphi}(\mathcal{E})$ and $\underline{\varphi}(\mathcal{E})$ are also continuous at $P$. Since $\overline{\varphi}(\mathcal{E})$ and $\underline{\varphi}(\mathcal{E})$ are finite, there is a finite number of points of discontinuity belonging to $\mathcal{E}$ and such that $\varphi(P) \ge a$ or $\varphi(P) \le -a$, where $a$ is a given positive number; also, the number of all the points of discontinuity is finite or denumerable. Let these points be $P_k$. If the set of $P_k$ is denumerable, the series formed from the $\varphi(P_k)$ is absolutely convergent. We introduce a new set function, defined on the family $C$:

$$\varphi_d(\mathcal{E}) = \sum_{P_k \in \mathcal{E}} \varphi(P_k), \tag{12}$$

where the summation is over the points $P_k$ of $\mathcal{E}$. This function is also completely additive. It is called the jump function. The difference

$$\varphi_c(\mathcal{E}) = \varphi(\mathcal{E}) - \varphi_d(\mathcal{E}) \tag{13}$$

is a completely additive function with no points of discontinuity.

73. Singular function. In future, the field $T$ will be the field $L_G$. As a matter of fact, not every function $\varphi(\mathcal{E})$, completely additive on a family $C$ of $L_G$, can be written as an integral (1).

We shall prove later the following fundamental theorem, which we shall shortly make use of.

THEOREM. Every function $\varphi(\mathcal{E})$, completely additive on $C$, can be expressed for all sets $\mathcal{E}$ belonging to any fixed set $\mathcal{E}_0$ of $C$ by the formula

$$\varphi(\mathcal{E}) = \varphi(\mathcal{E}H) + \int_{\mathcal{E}} f(P)\,G(d\mathcal{E}), \tag{14}$$

where $H$ is a definite set of $\mathcal{E}_0$, such that $G(H) = 0$, and $f(P)$ is measurable and summable on $\mathcal{E}_0$.

The term $\varphi(\mathcal{E}H)$ is called the singular part of $\varphi(\mathcal{E})$. The singular part is defined by the values of $\varphi(\mathcal{E})$ on sets of measure zero. The second term, which we call the absolutely continuous part, vanishes on any set of measure zero. We now show that the expression as the sum of a singular and absolutely continuous part is unique. Suppose we have, for $\mathcal{E}$ of $C$ that belong to $\mathcal{E}_0$, in addition to (14):

$$\varphi(\mathcal{E}) = \varphi(\mathcal{E}H_1) + \int_{\mathcal{E}} f_1(P)\,G(d\mathcal{E}),$$

where $G(H_1) = 0$. We have from this formula and (14):

$$\varphi(\mathcal{E}H) - \varphi(\mathcal{E}H_1) = \int_{\mathcal{E}} f_1(P)\,G(d\mathcal{E}) - \int_{\mathcal{E}} f(P)\,G(d\mathcal{E}).$$

We replace $\mathcal{E}$ by the set $\mathcal{E}H + \mathcal{E}H_1$ belonging to $\mathcal{E}_0$. On observing that $G(\mathcal{E}H + \mathcal{E}H_1) = 0$, so that the integral over $\mathcal{E}H + \mathcal{E}H_1$ is zero, and that $(\mathcal{E}H + \mathcal{E}H_1)H = \mathcal{E}H$ and $(\mathcal{E}H + \mathcal{E}H_1)H_1 = \mathcal{E}H_1$, we get $\varphi(\mathcal{E}H) = \varphi(\mathcal{E}H_1)$, whence it follows that the absolutely continuous parts must be the same, i.e.

$$\int_{\mathcal{E}} f(P)\,G(d\mathcal{E}) = \int_{\mathcal{E}} f_1(P)\,G(d\mathcal{E}). \tag{15}$$

To prove the theorem, we start from an arbitrary but fixed set $\mathcal{E}_0$ belonging to $C$ and assume that all $\mathcal{E} \subset \mathcal{E}_0$, as is stated in the theorem. When decomposing $\varphi(\mathcal{E})$ into a singular and an absolutely continuous part, we started from a set $\mathcal{E}_0$ and assumed that the whole of $\mathcal{E}$ belonged to $\mathcal{E}_0$. We thereby obtained a unique decomposition. If we had started from some other set $\mathcal{E}_0'$, different from $\mathcal{E}_0$ and belonging to $C$, the earlier original decomposition would evidently have been obtained for all sets belonging simultaneously to $\mathcal{E}_0$ and $\mathcal{E}_0'$. For, we should otherwise obtain two different decompositions of $\varphi(\mathcal{E})$ for sets belonging to the product $\mathcal{E}_0'' = \mathcal{E}_0\mathcal{E}_0'$, which also appears in the family $C$, and this is impossible, as we have seen above.

We can say, in the sense indicated, that the decomposition of $\varphi(\mathcal{E})$ into a singular and an absolutely continuous part is unique in the whole of the family $C$.

Let us show that the $f(P)$ appearing in the integrand in (14) is well defined, on the assumption that functions equivalent with respect to $G(\mathcal{E})$ are identified in the usual way. We have to show that, if (15) holds for all $\mathcal{E}$ that belong to $\mathcal{E}_0$, $\psi(P) = f_1(P) - f(P)$ is equivalent to zero on $\mathcal{E}_0$.

Let $\mathcal{E}_0^+$ be the part of $\mathcal{E}_0$ where $\psi(P) \ge 0$, and $\mathcal{E}_0^- = \mathcal{E}_0 - \mathcal{E}_0^+$. The sets $\mathcal{E}_0^+$ and $\mathcal{E}_0^-$ belong to $C$, and we have, on replacing $\mathcal{E}$ in (15) by $\mathcal{E}_0^+$ and $\mathcal{E}_0^-$:

$$\int_{\mathcal{E}_0^+} \psi(P)\,G(d\mathcal{E}) = \int_{\mathcal{E}_0^-} \psi(P)\,G(d\mathcal{E}) = 0,$$

whence it follows that $\psi(P)$ is equivalent to zero on $\mathcal{E}_0^+$ and $\mathcal{E}_0^-$, and hence on $\mathcal{E}_0$. If we form the function $f(P)$ for two sets $\mathcal{E}_0$ and $\mathcal{E}_0'$ of $C$, these two functions will be equivalent on $\mathcal{E}_0'' = \mathcal{E}_0\mathcal{E}_0'$ as above. In this sense, we can speak of the uniqueness of the function $f(P)$. If, for instance, all finite intervals belong to $C$, on applying the foregoing arguments to the widening intervals $-n \le x \le +n$; $-n \le y \le +n$ ($n = 1, 2, \ldots$), we define $f(P)$ uniquely on the whole of the plane. The function $f(P)$ is usually called the derivative of $\varphi(\mathcal{E})$ with respect to $G(\mathcal{E})$. Let $k_\varepsilon^P$ be a circle (or sphere) with centre $P$ and radius $\varepsilon$. It can be shown that, for all $P$, excepting possibly a set of measure zero with respect to $G(\mathcal{E})$, the ratio $\varphi(k_\varepsilon^P)/G(k_\varepsilon^P)$ tends to a function equivalent to $f(P)$ with respect to $G(\mathcal{E})$ as $\varepsilon$ tends to zero. Obviously, it is assumed here that $\varphi(\mathcal{E})$ is defined on the circle $k_\varepsilon^P$ for sufficiently small $\varepsilon$. We shall make no future use of this assertion and omit the proof.

DEFINITION. A function $\varphi(\mathcal{E})$ is said to be absolutely continuous with respect to $G(\mathcal{E})$ if, given any fixed $\mathcal{E}_0$ of $C$ and any $\varepsilon > 0$, there exists a positive $\eta$ such that $|\varphi(e)| \le \varepsilon$ if $e \subset \mathcal{E}_0$ and $G(e) \le \eta$.

If $\varphi(\mathcal{E})$ is absolutely continuous with respect to $G(\mathcal{E})$, obviously $\varphi(\mathcal{E}) = 0$ if $\mathcal{E} \in C$ and $G(\mathcal{E}) = 0$. The second term of (14) is an absolutely continuous function on $C$, as we know. Conversely, if we know that $\varphi(\mathcal{E})$ is absolutely continuous, then $\varphi(\mathcal{E}H) = 0$, since $G(\mathcal{E}H) = 0$ and $\varphi(\mathcal{E})$ is expressible by

$$\varphi(\mathcal{E}) = \int_{\mathcal{E}} f(P)\,G(d\mathcal{E}), \tag{16}$$

i.e. the singular part is absent. This discussion leads to the following corollary of the fundamental theorem.

COROLLARY. If $\varphi(\mathcal{E}) = 0$ for $G(\mathcal{E}) = 0$, then $\varphi(\mathcal{E})$ is expressible by (16) and is absolutely continuous with respect to $G(\mathcal{E})$ on any set $\mathcal{E}_0$ of $C$.

Notice that, if $G(\mathcal{E})$ is not continuous, the $\varphi(\mathcal{E})$ given by (16) is also in general not continuous. If, for instance, $G(P_0) = a \ne 0$, then $\varphi(P_0) = a f(P_0)$. But $\varphi(\mathcal{E})$ is absolutely continuous with respect to $G(\mathcal{E})$ in the sense indicated above.
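Decomposition (14) becomes elementary when everything is concentrated on finitely many points. The following sketch assumes such a hypothetical finite model, which is not in the text: `G` and `phi_pts` give the values of $G$ and $\varphi$ on single points, $H$ is the set of points of zero $G$-measure, and $f$ is the ratio of point values where $G$ is positive. All names are illustrative. It also shows the remark just made: where $G(P_0) = a \ne 0$, we have $\varphi(P_0) = a f(P_0)$ on the absolutely continuous part.

```python
from itertools import combinations

# Hypothetical point masses: G is non-negative, phi may change sign.
G = {"p": 1.0, "q": 0.5, "r": 0.0, "s": 0.0}
phi_pts = {"p": 3.0, "q": -1.0, "r": 2.0, "s": 0.0}


def measure(w, e):
    """Value on the set e of the additive function with point weights w."""
    return sum(w[p] for p in e)


# H carries the singular part: it has G-measure zero.
H = {p for p in G if G[p] == 0.0}

# Derivative of phi with respect to G, defined where G is positive.
f = {p: (phi_pts[p] / G[p] if G[p] > 0 else 0.0) for p in G}


def singular_part(e):
    """phi(eH): the values of phi on the part of e lying in H."""
    return measure(phi_pts, set(e) & H)


def absolutely_continuous_part(e):
    """Discrete analogue of the integral of f over e with respect to G."""
    return sum(f[p] * G[p] for p in e)


if __name__ == "__main__":
    pts = list(G)
    for r in range(len(pts) + 1):
        for e in combinations(pts, r):
            total = singular_part(e) + absolutely_continuous_part(e)
            print(e, measure(phi_pts, e), total)
```

Both columns of the printed output should agree for every subset, which is formula (14) in this model.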

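If $G(\mathcal{E})$ is continuous, the $\varphi(\mathcal{E})$ given by (16) is obviously continuous. If $G(\Delta)$ is the area of the interval $\Delta$, so that $L_G$ is the field $L$ of Lebesgue measurable sets, (14) becomes

$$\varphi(\mathcal{E}) = \varphi(\mathcal{E}H) + \iint_{\mathcal{E}} f(x, y)\,dx\,dy,$$

where $H$ is of Lebesgue measure zero. Equation (16) becomes

$$\varphi(\mathcal{E}) = \iint_{\mathcal{E}} f(x, y)\,dx\,dy,$$

and in this case $\varphi(\mathcal{E})$ is obviously continuous at every point.

The strict upper bound of the values of $\varphi(e)$ for sets $e$ belonging to $\mathcal{E}$ is obviously obtained in the case of (16) if we integrate $f(P)$ over a set for which $f(P) \ge 0$, and the strict lower bound is obtained if we integrate over a set for which $f(P) < 0$. We thus have the following formula for the positive, negative, and total variations of the $\varphi(\mathcal{E})$ given by (16) (with $f^+ = \max(f, 0)$ and $f^- = \min(f, 0)$):

$$\overline{\varphi}(\mathcal{E}) = \int_{\mathcal{E}} f^+(P)\,G(d\mathcal{E}); \quad \underline{\varphi}(\mathcal{E}) = \int_{\mathcal{E}} f^-(P)\,G(d\mathcal{E}); \quad \overline{\underline{\varphi}}(\mathcal{E}) = \int_{\mathcal{E}} |f(P)|\,G(d\mathcal{E}). \tag{17}$$

If we extract the jump function $\varphi_d(\mathcal{E})$ from $\varphi(\mathcal{E})$ and apply decomposition formula (14) to the remaining continuous function $\varphi_c(\mathcal{E})$, we get a decomposition of $\varphi(\mathcal{E})$ into three terms:

$$\varphi(\mathcal{E}) = \varphi_d(\mathcal{E}) + \varphi_c(\mathcal{E}H) + \int_{\mathcal{E}} f(P)\,G(d\mathcal{E}). \tag{18}$$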


74. The case of one variable. Let $\mathcal{E}_0$ be e.g. the finite interval $[a, b]$. We naturally introduce, instead of $G(\mathcal{E})$, the corresponding non-decreasing point function $g(x)$. We introduce, in place of the set function, the point function $\omega(x) = \varphi([a, x])$, and formula (14) for it takes the form

$$\omega(x) = \varphi([a, x]H) + \int_{[a, x]} f(x)\,dg(x), \tag{19}$$

and for the Lebesgue integrals:

$$\omega(x) = \varphi([a, x]H) + \int_a^x f(x)\,dx. \tag{20}$$

When $\varphi(\mathcal{E})$ is absolutely continuous we have the formula

$$\omega(x) = \int_{[a, x]} f(x)\,dg(x) \quad \text{and} \quad \omega(x) = \int_a^x f(x)\,dx. \tag{21}$$

Let us consider in greater detail the case of the Lebesgue integral. When passing from an interval function to a point function, constants can be added to the latter, and we can write

$$\omega(x) = \int_a^x f(x)\,dx + \omega(a), \tag{22}$$

where $f(x)$ is a measurable function, summable over $[a, b]$. Since the integral of $f(x)$ is absolutely continuous, function (22) has the following property: given any $\varepsilon > 0$, there is a corresponding $\eta > 0$ such that, if $(a_k, b_k)$ ($k = 1, 2, \ldots, n$) are non-overlapping intervals for which

$$\sum_{k=1}^n (b_k - a_k) < \eta, \tag{23}$$

then

$$\left|\sum_{k=1}^n [\omega(b_k) - \omega(a_k)]\right| < \varepsilon. \tag{24}$$

We shall start out from a point function, and say that the point function $\omega(x)$, defined on the interval $[a, b]$, is absolutely continuous on this interval if it has the property just indicated. Obviously, an absolutely continuous function is simply continuous, since we can take in particular $n = 1$. As we shall see later, there exist monotonic continuous point functions that are not absolutely continuous. The following property is a consequence of the one described: given any $\varepsilon > 0$, there is a corresponding $\eta > 0$ such that, if (23) is fulfilled, then

$$\sum_{k=1}^n |\omega(b_k) - \omega(a_k)| < \varepsilon. \tag{25}$$

In fact, if $\omega(x)$ has the above property (24), i.e. is absolutely continuous, there is an $\eta$ corresponding to the given $\varepsilon$ such that, when (23) is satisfied,

$$\left|\sum_{k=1}^n [\omega(b_k) - \omega(a_k)]\right| < \frac{\varepsilon}{2}. \tag{26}$$

We can split any system of intervals $(a_k, b_k)$ satisfying (23) into two classes, where class I contains the intervals for which $\omega(b_k) - \omega(a_k) \ge 0$, and class II those for which $\omega(b_k) - \omega(a_k) < 0$. Each class on its own also satisfies (23), so that we have by (26):

$$\sum_{\mathrm{I}} |\omega(b_k) - \omega(a_k)| = \left|\sum_{\mathrm{I}} [\omega(b_k) - \omega(a_k)]\right| < \frac{\varepsilon}{2}, \qquad \sum_{\mathrm{II}} |\omega(b_k) - \omega(a_k)| = \left|\sum_{\mathrm{II}} [\omega(b_k) - \omega(a_k)]\right| < \frac{\varepsilon}{2},$$

whence (25) follows. Since the terms of sum (25) are non-negative and $n$ is arbitrary, we obtain the following property of absolutely continuous functions: given any $\varepsilon > 0$, there is a corresponding $\eta > 0$ such that, if $(a_k, b_k)$ is a finite or denumerable set of mutually non-overlapping intervals satisfying the condition

$$\sum_k (b_k - a_k) < \eta, \tag{27}$$

then

$$\sum_k |\omega(b_k) - \omega(a_k)| \le \varepsilon. \tag{28}$$

Conversely, if this condition is satisfied, the original condition (24) is all the more satisfied, and $\omega(x)$ is absolutely continuous.
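The remark that a monotonic continuous function need not be absolutely continuous can be made concrete. The sketch below uses the classical Cantor function as an example; this choice is an assumption of the illustration, since the text has not yet introduced its own example, and the names `cantor` and `stage_intervals` are hypothetical. The $2^n$ intervals of the $n$-th stage of the Cantor construction have total length $(2/3)^n$, which can be made smaller than any $\eta$, yet the sum of the increments of the function over them stays equal to 1, so that condition (25) fails.

```python
from fractions import Fraction


def cantor(x, depth=60):
    """Cantor function on [0, 1], computed from ternary digits.

    Exact for the endpoints of the construction intervals; for other
    arguments the expansion is truncated after `depth` digits.
    """
    x = Fraction(x)
    if x <= 0:
        return Fraction(0)
    if x >= 1:
        return Fraction(1)
    value, scale = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        x *= 3
        digit = int(x)          # ternary digit: 0, 1 or 2
        x -= digit
        if digit == 1:
            # x lies in a removed middle third, where the function is constant.
            return value + scale
        value += scale * (digit // 2)
        scale /= 2
    return value


def stage_intervals(n):
    """The 2**n closed intervals left after n steps of removing middle thirds."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        nxt = []
        for a, b in intervals:
            third = (b - a) / 3
            nxt.append((a, a + third))
            nxt.append((b - third, b))
        intervals = nxt
    return intervals


if __name__ == "__main__":
    for n in (2, 5, 10):
        ivs = stage_intervals(n)
        length = sum(b - a for a, b in ivs)
        increments = sum(cantor(b) - cantor(a) for a, b in ivs)
        print(n, float(length), increments)
```

Exact rational arithmetic is used so that the total increment comes out as exactly 1 at every stage.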

THEOREM 1. The sum, difference and product of two absolutely continuous functions are absolutely continuous functions. The quotient $\omega_1(x)/\omega_2(x)$ of two absolutely continuous functions is also absolutely continuous, provided $\omega_2(x)$ does not vanish.

We shall only give a proof for the product $\omega_1(x)\,\omega_2(x)$. The individual functions are bounded in $[a, b]$, i.e. $|\omega_1(x)| \le l_1$ and $|\omega_2(x)| \le l_2$. We have

$$|\omega_1(b_k)\,\omega_2(b_k) - \omega_1(a_k)\,\omega_2(a_k)| \le |\omega_2(b_k)|\,|\omega_1(b_k) - \omega_1(a_k)| + |\omega_1(a_k)|\,|\omega_2(b_k) - \omega_2(a_k)| \le l_2\,|\omega_1(b_k) - \omega_1(a_k)| + l_1\,|\omega_2(b_k) - \omega_2(a_k)|.$$

On summing over $k$ and taking into account the absolute continuity of $\omega_1(x)$ and $\omega_2(x)$, property (25) follows for the product.

THEOREM 2. An absolutely continuous function $\omega(x)$ is of bounded variation, and its total variation $v(x)$ is also an absolutely continuous function.

Let $\eta_0$ be a positive number such that, when (23) is satisfied for $\eta = \eta_0$, we have

$$\sum_k |\omega(b_k) - \omega(a_k)| \le 1. \tag{29}$$

We subdivide $[a, b]$ by fixed points $a = c_0 < c_1 < \ldots < c_{N-1} < c_N = b$ such that $c_k - c_{k-1} < \eta_0$ ($k = 1, 2, \ldots, N$). Given any subdivision of the interval, we have (29) for $[c_{k-1}, c_k]$, and the sums $t_\delta$, and hence the total variation of $\omega(x)$ on each of the $[c_{k-1}, c_k]$, is not greater than unity, and not greater than $N$ on the whole of $[a, b]$. Let $\eta$ correspond to $\varepsilon$, so that, when (23) is satisfied, (25) is fulfilled. Let us subdivide each of the $[a_k, b_k]$ appearing in condition (23). The sum of the lengths of the sub-intervals obtained will satisfy condition (23) as before, and sum (25) corresponding to the sub-intervals will be $< \varepsilon$ as before. The strict upper bound of the sums of terms corresponding to the sub-intervals of $[a_k, b_k]$ obviously gives $v(b_k) - v(a_k)$, and thus we have, when (23) is fulfilled:

$$\sum_{k=1}^n [v(b_k) - v(a_k)] \le \varepsilon,$$

whence it follows that $v(x)$ is absolutely continuous, and the theorem is proved.
By forming the functions

$$\omega_1(x) = \tfrac{1}{2}[v(x) + \omega(x)]; \quad \omega_2(x) = \tfrac{1}{2}[v(x) - \omega(x)], \tag{30}$$

which are not decreasing [8] and are absolutely continuous by Theorem 1, we can write $\omega(x)$ as the difference between two non-decreasing absolutely continuous functions:

$$\omega(x) = \omega_1(x) - \omega_2(x). \tag{31}$$
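Formulae (30) and (31) can be followed on sampled values. The sketch below is an illustration under assumptions not made in the text: $\omega(x) = \sin x$ on $[0, 2\pi]$ is sampled on a uniform grid, and the total variation $v$ is replaced by the running sum of absolute increments over that grid, which is only a lower approximation to the true variation. The name `decompose` is hypothetical.

```python
import math


def decompose(values):
    """Return (v, w1, w2) for sampled values of a function.

    v is the running sum of absolute increments over the grid, and
    w1 = (v + w) / 2, w2 = (v - w) / 2 follow (30).
    """
    v = [0.0]
    for prev, cur in zip(values, values[1:]):
        v.append(v[-1] + abs(cur - prev))
    w1 = [(vi + wi) / 2 for vi, wi in zip(v, values)]
    w2 = [(vi - wi) / 2 for vi, wi in zip(v, values)]
    return v, w1, w2


# Sample sin x on [0, 2*pi]; the grid contains the extrema pi/2 and 3*pi/2.
grid = [2 * math.pi * i / 400 for i in range(401)]
w = [math.sin(x) for x in grid]
v, w1, w2 = decompose(w)

if __name__ == "__main__":
    print("total variation on the grid:", v[-1])
    print("w1 non-decreasing:", all(b >= a - 1e-12 for a, b in zip(w1, w1[1:])))
    print("w2 non-decreasing:", all(b >= a - 1e-12 for a, b in zip(w2, w2[1:])))
```

Each increment of `w1` equals half the sum of an increment of $\omega$ and its absolute value, so it cannot be negative apart from rounding, and similarly for `w2`.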


As we have mentioned, the indefinite integral (22) of the summable function $f(x)$ yields an absolutely continuous point function $\omega(x)$ in the sense of the above definition (23), (24) or (23), (25). Let us now prove the converse.

THEOREM 3. Every absolutely continuous function $\omega(x)$ is expressible as the indefinite integral

$$\omega(x) = \int_a^x f(x)\,dx + \omega(a). \tag{32}$$

By using the function $\omega_1(x)$ and putting $\omega_1(x) = \omega_1(a)$ for $x < a$, and $\omega_1(x) = \omega_1(b)$ for $x > b$, we can associate the interval $\Delta\,[\alpha, \beta]$ with a non-negative number $\varphi_1(\Delta) = \omega_1(\beta) - \omega_1(\alpha)$, where it is unimportant whether $\Delta$ be open or closed, in view of the continuity of $\omega_1(x)$. If the linear set $\mathcal{E}$ is Lebesgue measurable, there exists an open set $O$ containing $\mathcal{E}$ such that the set $(O - \mathcal{E})$ can be covered by a finite or denumerable number of intervals $[a_k, b_k]$, the sum of the lengths of which is as small as desired [35]. Since $\omega_1(x)$ is absolutely continuous in $[a, b]$ and is prolonged by a constant outside $[a, b]$, we can carry out this covering in such a way that the sum of the non-negative terms $\omega_1(b_k) - \omega_1(a_k)$ for the covering intervals $[a_k, b_k]$ is as small as desired, i.e. if $\mathcal{E}$ is Lebesgue measurable, it is also measurable with respect to $\omega_1(x)$. We can thus carry over $\varphi_1(\Delta)$ to all sets of $L$ that belong to $[a, b]$, with the property of being completely additive. It also follows from the above discussion that, if the Lebesgue measure $m(\mathcal{E}) = 0$, then $\varphi_1(\mathcal{E}) = 0$, so that, by the corollary in [73], we have

$$\varphi_1(\mathcal{E}) = \int_{\mathcal{E}} f_1(x)\,dx. \tag{33}$$

Similarly, by forming $\varphi_2(\Delta) = \omega_2(\beta) - \omega_2(\alpha)$, we get

$$\varphi_2(\mathcal{E}) = \int_{\mathcal{E}} f_2(x)\,dx, \tag{34}$$

where $f_2(x)$ is summable on $[a, b]$ and

$$\varphi(\mathcal{E}) = \varphi_1(\mathcal{E}) - \varphi_2(\mathcal{E}) = \int_{\mathcal{E}} [f_1(x) - f_2(x)]\,dx = \int_{\mathcal{E}} f(x)\,dx. \tag{35}$$

If we take the interval $[a, x]$ as $\mathcal{E}$, we arrive at (22).

The $f(x)$ appearing in (22) can be shown to be uniquely defined
apart from an added term which is zero almost everywhere. For, if
we had a second formula of type (22) for $\omega(x)$ with the integrand
$g(x)$, the integral of $f(x) - g(x)$ over any interval belonging to $[a, b]$
would be zero, and we could assert, by property 11 of [52], that the
difference in question is equivalent to zero. The $f(x)$ in (22) is called
the derivative of $\omega(x)$ and is usually denoted by $f(x) = \omega'(x)$. It can
be shown that, for all $x$ of $[a, b]$ excepting possibly a set of values of
Lebesgue measure zero, the limit exists:

$$\lim_{h \to 0} \frac{\omega(x + h) - \omega(x)}{h} = F(x),$$

where $F(x)$ is equivalent to $f(x)$. We omit the proof. If $f(x)$ is a
continuous function in $[a, b]$, there exists for all $x$ of $[a, b]$ the ordinary
derivative $\omega'(x) = f(x)$ of the integral with respect to the upper
limit. If $f(x)$ is absolutely as well as just simply continuous in $[a, b]$,
we obviously have

$$\omega'(x) = f(x) = \int_a^x h(x)\,dx + C,$$

where $h(x)$ is summable. We have $f'(x) = h(x)$, and $h(x)$ is called
the second order derivative of $\omega(x)$ and is written as usual as
$h(x) = \omega''(x)$. Similarly, $\omega(x)$ can have absolutely continuous derivatives
up to order $k$ and hence a summable derivative of order $(k + 1)$.
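It is now expressible as

$$\omega(x) = \int_a^x dx \int_a^x dx \ldots \int_a^x \omega^{(k+1)}(x)\,dx + \omega(a) + \frac{\omega'(a)}{1!}(x - a) + \ldots + \frac{\omega^{(k)}(a)}{k!}(x - a)^k.$$
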
All the above theory is readily extended to the case when $\omega(x)$ is
absolutely continuous with respect to the non-decreasing function
$g(x)$, which we shall assume continuous, i.e. given any $\varepsilon > 0$, there
is a corresponding positive $\eta$ such that, if $(a_k, b_k)$ are non-overlapping
intervals, for which

$$\sum_{k=1}^{n} [g(b_k) - g(a_k)] < \eta, \tag{36}$$

then

$$\Big|\sum_{k=1}^{n} [\omega(b_k) - \omega(a_k)]\Big| < \varepsilon. \tag{37}$$

Precisely as above, we can write (28) instead of (37), and $\omega(x)$ is
a continuous function in $[a, b]$. Instead of (32), we have

$$\omega(x) = \int_a^x f(x)\,dg(x) + \omega(a). \tag{38}$$


75. Absolutely continuous set functions. We return to the general
case of sets on a plane and consider in more detail the transformation
performed by (16), on the assumption that $f(P)$ is non-negative
and summable throughout the plane. If $f(P)$ is defined
and summable on some measurable set $\mathcal{E}_0$, on continuing it by zero
outside $\mathcal{E}_0$ we obtain a function summable throughout the plane.
Formula (16) defines a completely additive function $\varphi(\mathcal{E})$ on the
field $L_G$. This function is thereby defined for all semi-open intervals,
and we can continue this function of an interval $\varphi(\Delta)$ on the field
$L_\varphi$, as we did earlier as regards $G(\Delta)$.

THEOREM 1. Every set $\mathcal{E}$ of $L_G$ belongs to $L_\varphi$, and (16) gives the measure
$\varphi(\mathcal{E})$ of this set, obtained when $\varphi(\Delta)$ is continued as indicated.

Every open set $O$ is the sum of a denumerable number of
non-overlapping semi-open intervals $\Delta_k$ ($k = 1, 2, \ldots$). On summing both
sides of

$$\varphi(\Delta_k) = \int_{\Delta_k} f(P)\,G(d\mathcal{E})$$

over $k$, we obtain on the left the measure $\varphi(O)$ (since the measure is
additive), and on the right the integral over $O$ (because the integral
is additive), i.e. (16) gives the measure $\varphi(O)$ of any open set. On
observing that any closed set $F$ is the difference between the entire plane
$\mathcal{E}'$ (an open set) and some open set $O$ ($O = \mathcal{E}' - F$), and subtracting
term by term the formulae

$$\varphi(\mathcal{E}') = \int_{\mathcal{E}'} f(P)\,G(d\mathcal{E}); \quad \varphi(O) = \int_{O} f(P)\,G(d\mathcal{E}),$$

we find, as above, that (16) gives the measure $\varphi(F)$ of any closed
set $F$.

Let $\mathcal{E}$ be a set of $L_G$ and $\varepsilon_n$ a sequence of positive numbers tending
to zero. We know that there exist sequences $F_n$ and $O_n$ of closed
and open sets such that $F_n \subset \mathcal{E} \subset O_n$ and
$G(O_n - F_n) = G(O_n) - G(F_n) < \varepsilon_n$. Since integral (16) is absolutely continuous,
$\varphi(O_n - F_n) \to 0$, so that $\mathcal{E}$ belongs to $L_\varphi$ [38]. Here, the measure of $\mathcal{E}$
in $L_\varphi$ is obviously the limit of $\varphi(F_n)$ or $\varphi(O_n)$, i.e. the limit of the
integrals

$$\int_{F_n} f(P)\,G(d\mathcal{E}),$$

where $F_n \subset \mathcal{E}$ and $G(\mathcal{E} - F_n) \to 0$; also, since the integral is absolutely
continuous, this limit is the integral over $\mathcal{E}$, i.e. the measure of $\mathcal{E}$
in $L_\varphi$ is given by integral (16), and the theorem is proved. The theorem
still holds under the assumption that the non-negative function $f(P)$
is summable only with respect to some bounded set, and the proof
remains the same in essence. The next theorem gives a more precise
idea of the constitution of $L_\varphi$.

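THEOREM 2. The necessary and sufficient condition for a set $\mathcal{E}$ to
belong to $L_\varphi$ is that $\mathcal{E}$ can be expressed as the sum

$$\mathcal{E} = \mathcal{E}^{(1)} + \mathcal{E}^{(2)}, \tag{39}$$

where $\mathcal{E}^{(1)} \in L_G$ and we have $f(P) = 0$ at points $P$ of $\mathcal{E}^{(2)}$.

Necessity. Let $\mathcal{E}$ belong to $L_\varphi$. We introduce the sets

$$\mathcal{E}_0 = \{P : f(P) = 0\}; \quad \mathcal{E}_1 = \{P : f(P) \geqslant 1\}; \quad \mathcal{E}_n = \Big\{P : \frac{1}{n} \leqslant f(P) < \frac{1}{n-1}\Big\} \quad (n = 2, 3, \ldots) \tag{40}$$

and also write $\mathcal{E}'$ for the set of points at which $f(P)$ is not defined
or is equal to $(+\infty)$. The set $\mathcal{E}'$ is measurable with respect to $G(\mathcal{E})$
and $G(\mathcal{E}') = 0$. The same can be said of any part of $\mathcal{E}'$. The function
$f(P)$, measurable with respect to $G(\mathcal{E})$, is therefore also measurable
with respect to $\varphi(\mathcal{E})$, and all the sets (40), like $\mathcal{E}'$, belong to $L_\varphi$.
We further define the sets

$$\mathcal{E}^{(1)} = \mathcal{E}\mathcal{E}' + \sum_{n=1}^{\infty} \mathcal{E}\mathcal{E}_n; \quad \mathcal{E}^{(2)} = \mathcal{E}\mathcal{E}_0. \tag{41}$$

Formula (39) holds here. At points of $\mathcal{E}^{(2)}$, $f(P)$ vanishes, and we
have to show that $\mathcal{E}^{(1)} \in L_G$. The set $\mathcal{E}'$ has measure zero with respect
to $G(\mathcal{E})$, and it is sufficient to show that the sets $\mathcal{E}\mathcal{E}_n$ are measurable
with respect to $G(\mathcal{E})$. Since the $\mathcal{E}\mathcal{E}_n$ are measurable with respect to
$\varphi(\mathcal{E})$, there exist closed sets $F_n$ and open sets $O_n$ such that

$$F_n \subset \mathcal{E}\mathcal{E}_n \subset O_n \quad \text{and} \quad \varphi(O_n - F_n) < \frac{\varepsilon}{n}, \tag{42}$$

where $\varepsilon$ is a given positive number. We form the sets

$$D_n = \mathcal{E}_n (O_n - F_n) = \mathcal{E}_n O_n - \mathcal{E}_n F_n, \tag{43}$$

i.e. $D_n$ is the set of the points of $O_n - F_n$ at which
$1/n \leqslant f(P) < 1/(n - 1)$ or $f(P) \geqslant 1$ (with $n = 1$). Since $O_n - F_n \in L_G$, and $f(P)$
is measurable with respect to $G(\mathcal{E})$, the sets $D_n \in L_G$. Further, since
$F_n \subset \mathcal{E}\mathcal{E}_n$, it follows that $F_n \subset \mathcal{E}_n$ and $\mathcal{E}_n F_n = F_n$. Since $\mathcal{E}\mathcal{E}_n \subset O_n$,
it follows that $\mathcal{E}\mathcal{E}_n \subset O_n \mathcal{E}_n$ and, by (43), $\mathcal{E}\mathcal{E}_n - F_n \subset D_n$, so that
$|\mathcal{E}\mathcal{E}_n - F_n|_G \leqslant |D_n|_G$. But the set $D_n \in L_G$, and we can write
$|D_n|_G = G(D_n)$, i.e.

$$|\mathcal{E}\mathcal{E}_n - F_n|_G \leqslant G(D_n). \tag{44}$$

Further, in view of (43) and $\mathcal{E}_n F_n = F_n$, we have $D_n \subset O_n - F_n$,
and we get, on making use of (42):

$$\int_{D_n} f(P)\,G(d\mathcal{E}) = \varphi(D_n) \leqslant \varphi(O_n - F_n) < \frac{\varepsilon}{n}. \tag{45}$$

The set $D_n$ enters into $\mathcal{E}_n$ by (43), and $f(P) \geqslant 1/n$ on the latter
set. We thus have

$$\int_{D_n} f(P)\,G(d\mathcal{E}) \geqslant \frac{1}{n}\,G(D_n),$$

and inequality (45) leads to

$$\frac{G(D_n)}{n} < \frac{\varepsilon}{n},$$

i.e. $G(D_n) < \varepsilon$. By (44), we get $|\mathcal{E}\mathcal{E}_n - F_n|_G < \varepsilon$ and, since $\varepsilon$ is
arbitrary, this shows us that $\mathcal{E}\mathcal{E}_n \in L_G$; the necessity of condition
(39) is proved.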

Let us prove the sufficiency. Given (39), where $\mathcal{E}^{(1)} \in L_G$ and
$f(P) = 0$ at the points of $\mathcal{E}^{(2)}$, we have to show that $\mathcal{E} \in L_\varphi$. The set
$\mathcal{E}^{(1)} \in L_\varphi$ by Theorem 1. It remains to prove the same for $\mathcal{E}^{(2)}$. The
set $H$ of all points at which $f(P) = 0$ is measurable with respect
to $G(\mathcal{E})$, so that $H \in L_\varphi$, and, by (16), $\varphi(H) = 0$. But $\mathcal{E}^{(2)} \subset H$,
so that $\mathcal{E}^{(2)}$ is measurable with respect to $\varphi(\mathcal{E})$ and has measure zero.
The theorem is thus proved. Since $\mathcal{E}^{(2)} \subset H$, so that $\varphi(\mathcal{E}^{(2)}) = 0$,
we can assert that $\varphi(\mathcal{E}) = \varphi(\mathcal{E}^{(1)})$, i.e. to evaluate $\varphi(\mathcal{E})$ we have
to use (16) with $\mathcal{E}$ replaced by $\mathcal{E}^{(1)}$. A further remark: in (39), all
the points of the set $\mathcal{E}^{(1)}$ at which $f(P) = 0$ can be referred to $\mathcal{E}^{(2)}$.
The set of these points $\mathcal{E}^{(1)} H$ is measurable with respect to $G(\mathcal{E})$.

We shall now prove a theorem that enables us to reduce a
Lebesgue-Stieltjes integral with respect to $\varphi(\mathcal{E})$ to an integral with respect
to $G(\mathcal{E})$.

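THEOREM 3. If $F(P)$ is defined, measurable with respect to $\varphi(\mathcal{E})$ and
summable on a set $\mathcal{E}$ which is measurable and of finite measure with
respect to $\varphi(\mathcal{E})$, the product $F(P) f(P)$ is measurable with respect to $G(\mathcal{E})$
in $\mathcal{E}^{(1)}$, and we have

$$\int_{\mathcal{E}} F(P)\,\varphi(d\mathcal{E}) = \int_{\mathcal{E}^{(1)}} F(P) f(P)\,G(d\mathcal{E}), \tag{46}$$

which may be written in the form

$$\int_{\mathcal{E}} F(P) \Big[\int_{d\mathcal{E}} f(P)\,G(d\mathcal{E})\Big] = \int_{\mathcal{E}^{(1)}} F(P) f(P)\,G(d\mathcal{E}). \tag{46$_1$}$$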


On continuing $F(P)$ outside $\mathcal{E}$ by zero, we can assume that $F(P)$,
like $f(P)$, is defined everywhere. In addition, we can assume that
$F(P)$ and $f(P)$ have finite values at every point. The function $F(P)$
is measurable with respect to $\varphi(\mathcal{E})$ and $f(P)$ is measurable with respect
to $G(\mathcal{E})$, and hence also with respect to $\varphi(\mathcal{E})$. We introduce the new
function $F_0(P)$ by putting $F_0(P) = F(P)$ if $f(P) \neq 0$, and $F_0(P) = 0$
if $f(P) = 0$. In other words, $F_0(P) = F(P)\,\omega_{CH}(P)$, where $\omega_{CH}(P)$ is
the characteristic function of the complement $CH$ of the set $H$ of points
at which $f(P) = 0$. As mentioned, $H \in L_G$, so that $H \in L_\varphi$, i.e. both
$F(P)$ and $\omega_{CH}(P)$ are measurable with respect to $\varphi(\mathcal{E})$, i.e. $F_0(P)$ is
also measurable with respect to $\varphi(\mathcal{E})$. Let us show that $F_0(P)$ is also measurable
with respect to $G(\mathcal{E})$. Since $F_0(P)$ is measurable with respect to $\varphi(\mathcal{E})$,
given any $a$, the set $\mathcal{E}_a$ of the points at which $F_0(P) > a$ can be
written as $\mathcal{E}_a = \mathcal{E}_a^{(1)} + \mathcal{E}_a^{(2)}$, where $\mathcal{E}_a^{(1)} \in L_G$ and $f(P) = 0$ at points
of $\mathcal{E}_a^{(2)}$. If $a \geqslant 0$, by definition of $F_0(P)$, the set $\mathcal{E}_a^{(2)}$ is absent, so
that $\mathcal{E}_a \in L_G$. If $a < 0$, the set $\mathcal{E}_a$ contains the whole of $H$, and,
as mentioned above, we can assume in this case that $\mathcal{E}_a^{(2)}$ coincides
with $H$, i.e. that $f(P) > 0$ at all points of $\mathcal{E}_a^{(1)}$. But $H \in L_G$, so that
$\mathcal{E}_a = \mathcal{E}_a^{(1)} + H \in L_G$. Hence $\mathcal{E}_a \in L_G$ for all $a$, $F_0(P)$ is measurable
with respect to $G(\mathcal{E})$, and the product $F_0(P) f(P)$ is therefore also
measurable with respect to $G(\mathcal{E})$. We return to the set $\mathcal{E}^{(1)}$ mentioned
in the theorem. At points of this set the product $F_0(P) f(P)$ coincides
with $F(P) f(P)$, i.e. $F(P) f(P)$ is measurable with respect to $G(\mathcal{E})$ in
$\mathcal{E}^{(1)}$. We shall confine ourselves in the proof of (46) to the case when
$F(P)$ is bounded. The proof is very similar for unbounded functions.
Let $|F(P)| < L$ and $\mathcal{E}_{n,k}$ be the set of all $P$ of $\mathcal{E}$ at which

$$\frac{k}{2^n} L \leqslant F(P) < \frac{k+1}{2^n} L \qquad (k = -2^n, -2^n + 1, \ldots, 2^n - 1).$$

We form the piecewise constant functions

$$F_n(P) = \frac{k}{2^n} L, \quad \text{if } P \in \mathcal{E}_{n,k}.$$

The sequence $F_n(P)$ is non-decreasing as $n$ increases and is bounded in
absolute value by the number $L$. We have

$$\int_{\mathcal{E}} F_n(P)\,\varphi(d\mathcal{E}) = \sum_{k=-2^n}^{2^n - 1} \frac{kL}{2^n}\,\varphi(\mathcal{E}_{n,k}) = \sum_{k=-2^n}^{2^n - 1} \frac{kL}{2^n} \int_{\mathcal{E}_{n,k}^{(1)}} f(P)\,G(d\mathcal{E}),$$

where $\mathcal{E}_{n,k}^{(1)}$ is obtained from $\mathcal{E}_{n,k}$ in accordance with (39). On summing
over all the $k$ in question, we get the set $\mathcal{E}^{(1)}$, and, on taking $kL/2^n$
under the integral sign in the last formula, we arrive at

$$\int_{\mathcal{E}} F_n(P)\,\varphi(d\mathcal{E}) = \int_{\mathcal{E}^{(1)}} F_n(P) f(P)\,G(d\mathcal{E}).$$

The integrand on the right-hand side has an absolute value not
exceeding the integrable function $L f(P)$, and we can pass to the limit
under the integral sign on both sides, which leads us to (46). If $\mathcal{E}$
is measurable with respect to $G(\mathcal{E})$ also, we can replace $\mathcal{E}^{(1)}$ by $\mathcal{E}$
in (46), since $\mathcal{E}^{(2)} = \mathcal{E} - \mathcal{E}^{(1)}$ and $f(P) = 0$ on $\mathcal{E}^{(2)}$, so that the integral
over $\mathcal{E}^{(2)}$ vanishes.
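
Formula (46) becomes a finite sum when the plane is replaced by finitely many points. The sketch below is an illustration under that assumption only: the four labelled points, their weights and the names `G_weight`, `phi`, `integral_dphi`, `integral_dG` are hypothetical and not part of the text. In such a model every set is measurable, so the delicate part of Theorem 2 does not arise; only the identity itself is displayed.

```python
from fractions import Fraction as Fr

# Hypothetical finite model: four points carrying G-measure, a density f >= 0
# and a bounded function F.
G_weight = {"p1": Fr(1), "p2": Fr(2), "p3": Fr(1, 2), "p4": Fr(3)}
f = {"p1": Fr(2), "p2": Fr(0), "p3": Fr(4), "p4": Fr(1, 3)}
F = {"p1": Fr(5), "p2": Fr(-7), "p3": Fr(1), "p4": Fr(6)}

def phi(E):
    """phi(E) = integral of f over E with respect to G, as in formula (16)."""
    return sum((f[p] * G_weight[p] for p in E), Fr(0))

def integral_dphi(h, E):
    """Integral of h over E with respect to phi."""
    return sum((h[p] * phi({p}) for p in E), Fr(0))

def integral_dG(h, E):
    """Integral of h over E with respect to G."""
    return sum((h[p] * G_weight[p] for p in E), Fr(0))

E = set(G_weight)                      # the whole model space
E1 = {p for p in E if f[p] != 0}       # the part E^(1) of (39)
E2 = E - E1                            # the part E^(2), where f vanishes

lhs = integral_dphi(F, E)
rhs = integral_dG({p: F[p] * f[p] for p in E}, E1)
```

Here `lhs` and `rhs` agree, and `phi(E2)` is zero, which is the reason why the part $\mathcal{E}^{(2)}$ contributes nothing to either side.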

The proofs of the last three theorems are taken from Stone’s Linear 
Transformations in Hilbert Space and Their Applications To Analysis. 

If $G(\mathcal{E})$ is a function of bounded variation, we can use the canonical
form as the difference between two non-negative functions:

$$G(\mathcal{E}) = G_1(\mathcal{E}) - G_2(\mathcal{E})$$

and apply the theorem to $G_1(\mathcal{E})$ and $G_2(\mathcal{E})$. We have here, instead
of $L_G$, the field $L_V$, where $V(\mathcal{E}) = G_1(\mathcal{E}) + G_2(\mathcal{E})$. We also obtain a
representation of $\varphi(\mathcal{E})$ as the difference between two non-negative
functions $\varphi(\mathcal{E}) = \varphi_1(\mathcal{E}) - \varphi_2(\mathcal{E})$, and instead of $L_\varphi$ we have $L_{V_1}$,
where $V_1(\mathcal{E}) = \varphi_1(\mathcal{E}) + \varphi_2(\mathcal{E})$. We can consider similarly the case
when $f(P)$ changes sign. Here we have to write $f(P)$ as the difference
between its positive and negative parts:

$$f(P) = f^{+}(P) - f^{-}(P)$$

and prove the theorem and formula for each separate term. Now,
$\varphi(\mathcal{E})$ is expressible as the difference between two non-negative functions,
and the field of sets $L_\varphi$ is replaced by the field of sets measurable
with respect to the function

$$v(\mathcal{E}) = \int_{\mathcal{E}} |f(P)|\,G(d\mathcal{E}).$$

The theorem can similarly be extended to the case of complex
functions $f(P)$ and $G(\mathcal{E})$.

Let us consider further the form that the theorem takes in the
case of one variable. Let $g(x)$ be a non-decreasing and bounded
function on the finite interval $[a, b]$ and $f(x)$ be non-negative and
summable with respect to $g(x)$ on $[a, b]$. We consider the function

$$\omega(x) = \int_{[a, x]} f(x)\,dg(x).$$

Every set measurable with respect to $g(x)$ will also be measurable
with respect to $\omega(x)$, and the necessary and sufficient condition
for the set $\mathcal{E}$ to be measurable with respect to $\omega(x)$ is that it can
be written in the form (39), where $\mathcal{E}^{(1)}$ is measurable with respect
to $g(x)$ and $f(x) = 0$ at all points of $\mathcal{E}^{(2)}$. If $F(x)$ is summable with
respect to $\omega(x)$ on the set $\mathcal{E}$ measurable with respect to $\omega(x)$, we have

$$\int_{\mathcal{E}} F(x)\,d\Big[\int_{[a, x]} f(x)\,dg(x)\Big] = \int_{\mathcal{E}^{(1)}} F(x) f(x)\,dg(x), \tag{47}$$

where $\mathcal{E}^{(1)}$ is the part of $\mathcal{E}$ appearing in (39). If $F(x)$ on $\mathcal{E}$ is measurable
with respect to $g(x)$, $\mathcal{E}^{(1)}$ can be replaced by $\mathcal{E}$. We obtain when
$g(x) = x$:

$$\int_{\mathcal{E}} F(x)\,d\Big[\int_a^x f(x)\,dx\Big] = \int_{\mathcal{E}^{(1)}} F(x) f(x)\,dx. \tag{48}$$


Hence, if $\omega(x)$ has no singular term, the Lebesgue-Stieltjes integral
of $F(x)$ with respect to $\omega(x)$ is given by the Lebesgue integral. If
$\mathcal{E}$ is the interval $[a, b]$, (47) and (48) can be written as

$$\int_{[a, b]} F(x)\,d\Big[\int_{[a, x]} f(x)\,dg(x)\Big] = \int_{[a, b]} F(x) f(x)\,dg(x), \tag{49}$$

$$\int_a^b F(x)\,d\Big[\int_a^x f(x)\,dx\Big] = \int_a^b F(x) f(x)\,dx, \tag{49$_1$}$$

where e.g. in the latter formula $F(x)$ is assumed summable with respect
to the indefinite integral of $f(x)$. Suppose that $\Phi(x)$ and $\Psi(x)$ are
absolutely continuous functions, i.e.

$$\Phi(x) = \int_a^x \Phi'(x)\,dx + C_1; \quad \Psi(x) = \int_a^x \Psi'(x)\,dx + C_2, \tag{50}$$

where $\Phi'(x)$ and $\Psi'(x)$ are summable on $[a, b]$. On using (49$_1$), we can
write:

$$\int_a^b \Phi(x) \Psi'(x)\,dx + \int_a^b \Psi(x) \Phi'(x)\,dx = \int_a^b \Phi(x)\,d\Psi(x) + \int_a^b \Psi(x)\,d\Phi(x),$$

where the integrals on the right are ordinary Stieltjes integrals,
since $\Phi(x)$ and $\Psi(x)$ are continuous and of bounded variation. We have
the formula for the right-hand side [2]:

$$\int_a^b \Phi(x)\,d\Psi(x) + \int_a^b \Psi(x)\,d\Phi(x) = [\Phi(x) \Psi(x)]_{x=a}^{x=b},$$

and substitution in the previous formula gives us the formula for
integration by parts:

$$\int_a^b \Phi(x) \Psi'(x)\,dx + \int_a^b \Psi(x) \Phi'(x)\,dx = [\Phi(x) \Psi(x)]_{x=a}^{x=b}. \tag{51}$$

It follows at once from (50) that, in the case of the sum $\Phi(x) + \Psi(x)$,
the integrand is equal to $\Phi'(x) + \Psi'(x)$, i.e.
$[\Phi(x) + \Psi(x)]' = \Phi'(x) + \Psi'(x)$. On putting $b = x$ in (51), we get

$$\Phi(x) \Psi(x) = \int_a^x [\Phi'(x) \Psi(x) + \Phi(x) \Psi'(x)]\,dx + \Phi(a) \Psi(a),$$

i.e.

$$[\Phi(x) \Psi(x)]' = \Phi'(x) \Psi(x) + \Phi(x) \Psi'(x).$$

The case of an infinite interval can be treated similarly.
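
Formula (51) can be checked numerically for functions that are absolutely continuous without being differentiable everywhere. The sketch below makes its own choices, none of which come from the text: $\Phi(x) = |x - 1/2|$ and $\Psi(x) = x^2$ on $[0, 1]$, and a midpoint rule (the helper `midpoint`) in place of the Lebesgue integral. The value assigned to $\Phi'$ at the single point $x = 1/2$ is immaterial, since the derivative is defined only up to a set of measure zero.

```python
def midpoint(h, a, b, n=1000):
    """Midpoint-rule approximation of the integral of h over [a, b]."""
    step = (b - a) / n
    return step * sum(h(a + (i + 0.5) * step) for i in range(n))

# Illustrative absolutely continuous functions on [0, 1].
Phi = lambda x: abs(x - 0.5)
dPhi = lambda x: 1.0 if x > 0.5 else -1.0   # derivative almost everywhere
Psi = lambda x: x * x
dPsi = lambda x: 2.0 * x

# Left-hand side of (51): integral of Phi*Psi' + Psi*Phi' over [0, 1].
lhs = midpoint(lambda x: Phi(x) * dPsi(x) + Psi(x) * dPhi(x), 0.0, 1.0)
# Right-hand side of (51): the increment of Phi*Psi between the end-points.
rhs = Phi(1.0) * Psi(1.0) - Phi(0.0) * Psi(0.0)
```

Both sides come out close to $1/2$; the small discrepancy is the quadrature error of the midpoint rule.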
76. Example. We shall now give an example of a non-decreasing
continuous function which is not absolutely continuous, and for
which the second, i.e. absolutely continuous, term is absent in (20).
We start by forming a closed set $F_0$ on the interval $[0, 1]$.
We split $[0, 1]$ into three equal parts by the points 1/3 and 2/3, then
remove the central open interval (1/3, 2/3). We divide each of the
remaining intervals [0, 1/3] and [2/3, 1] into three equal parts: the first
by the points 1/9 and 2/9, and the second by the points 7/9 and 8/9.
We then remove from each of these intervals the central parts (1/9,
2/9) and (7/9, 8/9). Each of the remaining intervals:

$$\Big[0, \tfrac{1}{9}\Big];\ \Big[\tfrac{2}{9}, \tfrac{1}{3}\Big];\ \Big[\tfrac{2}{3}, \tfrac{7}{9}\Big];\ \Big[\tfrac{8}{9}, 1\Big]$$

is again divided into three equal parts and the central open interval
removed, and so on. Thus all in all we remove from $[0, 1]$ a denumerable
number of open intervals having no common points or even
common ends:

$$\Big(\tfrac{1}{3}, \tfrac{2}{3}\Big);\ \Big(\tfrac{1}{9}, \tfrac{2}{9}\Big);\ \Big(\tfrac{7}{9}, \tfrac{8}{9}\Big);\ \Big(\tfrac{1}{27}, \tfrac{2}{27}\Big);\ \Big(\tfrac{7}{27}, \tfrac{8}{27}\Big);\ \Big(\tfrac{19}{27}, \tfrac{20}{27}\Big);\ \Big(\tfrac{25}{27}, \tfrac{26}{27}\Big);\ \ldots \tag{52}$$

i.e. we remove an open set $H_0$, the set that remains, which we write
as $F_0$, being closed. One open interval of length 1/3 is removed in
the first step, two of length $1/3^2$ in the second, $2^2$ of length $1/3^3$ in the
third, and in general, $2^{n-1}$ intervals of length $1/3^n$ in the $n$th step.
The Lebesgue measure of the open set $H_0$ is thus equal to

$$\sum_{n=1}^{\infty} \frac{2^{n-1}}{3^n} = 1,$$

and the set $F_0$ remaining on the interval $[0, 1]$ thus has measure zero.


We now define a function $f(x)$ on $[0, 1]$ as follows: we put

$$f(x) = \tfrac{1}{2} \ \text{if}\ x \in \Big(\tfrac{1}{3}, \tfrac{2}{3}\Big); \quad f(x) = \tfrac{1}{4} \ \text{if}\ x \in \Big(\tfrac{1}{9}, \tfrac{2}{9}\Big); \quad f(x) = \tfrac{3}{4} \ \text{if}\ x \in \Big(\tfrac{7}{9}, \tfrac{8}{9}\Big);$$

in general, we put $f(x)$ equal to $1/2^n, 3/2^n, 5/2^n, \ldots, (2^n - 1)/2^n$ in the
sequence (from left to right) of intervals which we remove at the $n$th
step. Thus $f(x)$ is so far defined at points of the set $H_0$ and is constant
in each of the open intervals (52) of which this set is composed.
We further define $f(x)$ at the ends of $[0, 1]$ by putting $f(0) = 0$ and
$f(1) = 1$. The principle in accordance with which we have defined
$f(x)$ on each of intervals (52) is as follows: in each interval of the
set $H_0$ obtained at the $n$th step, we put $f(x)$ equal to the arithmetic
mean of its values in the neighbouring intervals obtained earlier,
or at the ends of $[0, 1]$ if there are no previously obtained intervals
of $H_0$ on one side of our new interval of $H_0$. It follows directly from
this that $f(x)$ is a non-decreasing function on the set $H_0$. Let us
continue the definition of $f(x)$ on to $F_0$. Let $x_0 \in F_0$. Since $F_0$ has
measure zero, there is a point of $H_0$ in any $\varepsilon$-neighbourhood of $x_0$,
and if $x$ approaches $x_0$ from the left in the set $H_0$, $f(x)$ is non-decreasing
and has a limit, which we take as the value of $f(x)$ at $x = x_0$. In other
words, our definition amounts to this: we take $f(x_0)$ equal to the
strict upper bound of the values of $f(x)$ for $x$ less than $x_0$ and belonging
to $H_0$. At $x = 1$, this definition obviously leads to the previous value
$f(1) = 1$. We have thus defined throughout $[0, 1]$ a function which
is clearly non-decreasing. It may easily be seen to be continuous.
For, if it had a discontinuity at $x = x'$, at least one of the intervals
$[f(x' - 0), f(x')]$ or $[f(x'), f(x' + 0)]$ would not reduce to a point and
would not contain values of $f(x)$ inside itself because $f(x)$ is monotonic.
But the values of $f(x)$, defined above only on the set $H_0$, are everywhere
dense in $[0, 1]$, and we have arrived at an absurdity by assuming
a discontinuity of $f(x)$. We recall that $f(x)$ is constant on each of
intervals (52). On the basis of the non-decreasing continuous function
$f(x)$, we can form a completely additive non-negative set function
$\varphi(\mathcal{E})$, which is always defined on B-sets. By what has been said,
$\varphi(H_0) = 0$, and all the more, $\varphi(\mathcal{E}) = 0$ on every B-set forming part
of $H_0$. If we take the interval $[0, x]$, we can write:

$$[0, x] = [0, x]\,H_0 + [0, x]\,F_0,$$

so that

$$f(x) = \varphi([0, x]) = \varphi([0, x]\,H_0) + \varphi([0, x]\,F_0).$$

The first term is zero by what has been said, whilst the measure
of $F_0$ is zero, i.e. $f(x)$ reduces to a singular part [74]:

$$f(x) = \varphi([0, x]\,F_0),$$

where $F_0$ plays the role of $H$ in (20) and $f(x)$ the role of $\omega(x)$.
Let us investigate further the set $F_0$. The continuous non-decreasing
function $f(x)$ takes all real values from zero to unity. On each of the
excluded intervals, including the ends, $f(x)$ is constant, the set of
excluded intervals being denumerable. The set of all values of $f(x)$
is not denumerable (has the power of a continuum). It is therefore
evident that $F_0$ contains points different from the ends of the excluded
intervals. It can be shown that $F_0$ has the power of a continuum.
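
The function $f(x)$ of this section can be evaluated by reading off the ternary digits of $x$, which is equivalent to the step-by-step definition above: a digit 1 means that $x$ lies in (or at the end of) an interval removed at that step, where $f$ is constant. The following is a minimal sketch; the names `cantor` and `removed_measure` and the cut-off `depth` are choices made here for illustration, and for points of $F_0$ with a non-terminating expansion the result is only an approximation from below.

```python
from fractions import Fraction

def cantor(x, depth=40):
    """Value of the function f of this section at x in [0, 1].

    Exact for points whose ternary expansion terminates or contains a
    digit 1 within `depth` digits; otherwise truncated (assumption).
    """
    x = Fraction(x)
    if x <= 0:
        return Fraction(0)
    if x >= 1:
        return Fraction(1)
    value = Fraction(0)
    weight = Fraction(1, 2)
    for _ in range(depth):
        x *= 3
        digit = int(x)          # next ternary digit of x
        x -= digit
        if digit == 1:          # x lies in a removed interval (or at its end)
            return value + weight
        if digit == 2:          # x lies to the right of the removed interval
            value += weight
        weight /= 2
        if x == 0:
            break
    return value

def removed_measure(steps):
    """Total length of the intervals removed in the first `steps` steps."""
    return sum(Fraction(2 ** (n - 1), 3 ** n) for n in range(1, steps + 1))
```

For instance, `cantor(Fraction(1, 6))` lies in the interval (1/9, 2/9) and gives 1/4, while `removed_measure(n)` equals $1 - (2/3)^n$ and so tends to the measure 1 of $H_0$.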


77. Absolutely continuous functions of several variables. Such functions can
be introduced along the same lines as for absolutely continuous functions of
a single variable (a point) [74]. We shall confine ourselves to functions of two
variables. Let $F(x, y)$ be a given continuous function in the two-dimensional
interval $\Delta_0\,[a \leqslant x \leqslant b;\ c \leqslant y \leqslant d]$. With the aid of this we can form a function
$\varphi(\delta)$ of an interval contained in $\Delta_0$, i.e. if $\delta$ is the interval defined by
$x_1 \leqslant x \leqslant x_2$, $y_1 \leqslant y \leqslant y_2$, we put, as before:

$$\varphi(\delta) = F(x_2, y_2) - F(x_1, y_2) - F(x_2, y_1) + F(x_1, y_1), \tag{53}$$

where it is of no importance whether $\delta$ is open or not, since $F(x, y)$ is continuous
by hypothesis. If we add to $F(x, y)$ the sum $f_1(x) + f_2(y)$, in which the first
term depends only on $x$ and the second only on $y$, this has no effect on $\varphi(\delta)$.
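
The interval function (53) and its insensitivity to an added sum of the form $f_1(x) + f_2(y)$ are easy to check directly. The sketch below uses illustrative choices that do not come from the text: the helper name `phi`, the function $F(x, y) = xy$, and the added terms $x^3$ and $5y$.

```python
def phi(F, x1, x2, y1, y2):
    """Interval function (53) for the rectangle x1 <= x <= x2, y1 <= y <= y2."""
    return F(x2, y2) - F(x1, y2) - F(x2, y1) + F(x1, y1)

# Illustrative point function and a copy altered by f1(x) + f2(y).
F = lambda x, y: x * y
F_shifted = lambda x, y: x * y + x ** 3 + 5 * y

whole = phi(F, 1, 3, 2, 5)                       # equals the area (3-1)*(5-2)
left, right = phi(F, 1, 2, 2, 5), phi(F, 2, 3, 2, 5)
```

For this $F$ the value of `phi` is the area of the rectangle, it is unchanged when `F_shifted` is used instead, and it is additive when the rectangle is cut into two.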
The interval function $\varphi(\delta)$ is said to be absolutely continuous if it satisfies
a condition analogous to (24) of [74], i.e. if, given $\varepsilon > 0$, there is a corresponding
$\eta > 0$ such that, when the sum of the areas of the non-overlapping intervals
$\delta_k$ ($k = 1, 2, \ldots, n$) is $< \eta$, we have

$$\Big|\sum_{k=1}^{n} \varphi(\delta_k)\Big| < \varepsilon.$$

DEFINITION. $F(x, y)$ is described as an absolutely continuous function of two
variables $(x, y)$ if $\varphi(\delta)$ defined by (53) is an absolutely continuous function of an
interval and if, in addition, $F(a, y)$ and $F(x, c)$ are absolutely continuous functions
of $y$ and $x$.

The latter proviso regarding the absolute continuity of $F(x, y)$ on the lower
and left-hand sides of the interval $\Delta_0$ is necessary because of the possibility
of adding the sum $f_1(x) + f_2(y)$ to $F(x, y)$. We write down the obvious equation:

$$F(x, y) = [F(x, y) - F(a, y) - F(x, c) + F(a, c)] + [F(x, c) - F(a, c)] + [F(a, y) - F(a, c)] + F(a, c).$$

The first term on the right-hand side is $\varphi(\Delta_{x,y})$, where $\Delta_{x,y}$ is the interval
$a \leqslant x' \leqslant x$, $c \leqslant y' \leqslant y$, and, as in [75], this function can be expressed as
an indefinite double integral of a summable function. The second and third
terms on the right-hand side are absolutely continuous functions of $x$ and $y$,
and hence are expressible as simple indefinite integrals. Thus every absolutely
continuous function $F(x, y)$ can be expressed as

$$F(x, y) = \int_a^x \int_c^y f(x, y)\,dx\,dy + \int_a^x g(x)\,dx + \int_c^y h(y)\,dy + F(a, c). \tag{54}$$
It may easily be seen that, conversely, every function expressible by the last
formula is absolutely continuous. We can use Fubini's theorem to rewrite the
last formula as

$$F(x, y) = \int_a^x \Big[\int_c^y f(x, y)\,dy + g(x)\Big] dx + \int_c^y h(y)\,dy + F(a, c), \tag{55}$$

or

$$F(x, y) = \int_c^y \Big[\int_a^x f(x, y)\,dx + h(y)\Big] dy + \int_a^x g(x)\,dx + F(a, c). \tag{56}$$

It is clear from this that, if $F(x, y)$ is an absolutely continuous function of
two variables, it is an absolutely continuous function of $x$ for any fixed value of
$y$, and an absolutely continuous function of $y$ for any fixed $x$. The converse is
false, i.e. a function may be absolutely continuous in each variable yet not
be absolutely continuous in both.

By the definition of [74], the integrands of the first terms of (55) and (56)
yield the partial derivatives of the absolutely continuous function $F(x, y)$:

$$\frac{\partial F(x, y)}{\partial x} = \int_c^y f(x, y)\,dy + g(x); \quad \frac{\partial F(x, y)}{\partial y} = \int_a^x f(x, y)\,dx + h(y). \tag{57}$$

The integrand in these formulae defines the second order mixed derivative:

$$\frac{\partial}{\partial y}\Big[\frac{\partial F(x, y)}{\partial x}\Big] = \frac{\partial}{\partial x}\Big[\frac{\partial F(x, y)}{\partial y}\Big] = f(x, y).$$

If the partial derivatives $F_x$ and $F_y$ are themselves absolutely continuous
functions of two variables, we can define all the second order partial derivatives.
If all these are absolutely continuous functions of two variables, we can define
all the third order derivatives, and so on.

It can be shown that the partial derivatives are the limits of the corresponding
ratios almost everywhere in $\Delta_0$. For instance, $F_x$ is the limit of the ratio
$[F(x + h, y) - F(x, y)]/h$. An absolutely continuous function $F(x, y)$ can be
interpreted as a function of a point $F(M)$ on the plane. If we introduce new
Cartesian coordinates $(x', y')$ in place of the old $(x, y)$ on this plane, we obtain
a new function $F(x', y')$, which may be no longer absolutely continuous in the
new variables. Let us take as an example the absolutely continuous function

$$F(x, y) = \int_0^x f(t)\,dt,$$

where $f(t)$ is the continuous, but not absolutely continuous, function that we
constructed in [76] for the interval $[0, 1]$. We continue it by assuming $f(x) = 0$
for $x < 0$ and $f(x) = 1$ for $x > 1$. The above formula defines an absolutely
continuous function $F(x, y)$ throughout the plane (it in fact depends only on
$x$). On rotating the axes 45 degrees about the origin, we obtain in the new
coordinates:


$$F(x', y') = \int_0^{\frac{1}{\sqrt{2}}(x' - y')} f(t)\,dt.$$

The partial derivative of this function with respect to $x'$, given by

$$\frac{\partial F(x', y')}{\partial x'} = \frac{1}{\sqrt{2}}\, f\Big(\frac{x' - y'}{\sqrt{2}}\Big),$$

is not an absolutely continuous function of $y'$ for a given $x'$, as must be the
case, by (57), if $F(x', y')$ is to be an absolutely continuous function of two variables.
Notice that, whatever the choice of Cartesian coordinates, the function con- 
structed is absolutely continuous with respect to each variable for all values 
of the other variable. A theory of absolutely continuous functions of any 
number of variables can be built up on the same lines as above. 

A more general definition of partial derivatives will be given in the next 
chapter, applicable to a wider class than absolutely continuous functions of 
several variables. 


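78. Supplementary propositions. In this section and the next we
shall introduce some new concepts and prove supplementary
propositions needed for the proof of the fundamental theorem of [73]
and for further generalization of the concept of integral.

We take functions $\varphi(\mathcal{E})$, completely additive on the family $C$
consisting of some set $\mathcal{E}_0$ of $L_G$ and of all the sets of $L_G$ forming
part of $\mathcal{E}_0$, $G(\mathcal{E}_0)$ being assumed finite and non-zero. We write $V_1$
for the set of all such functions. If $\varphi_1(\mathcal{E})$ and $\varphi_2(\mathcal{E}) \in V_1$, then
$c_1\varphi_1(\mathcal{E}) + c_2\varphi_2(\mathcal{E})$ also $\in V_1$. As we know, for any $\varphi(\mathcal{E})$ of $V_1$, the
sums

$$t_\delta(\varphi) = \sum_k |\varphi(\mathcal{E}_k)| \tag{58}$$

remain bounded for any subdivision $\delta$ of $\mathcal{E}_0$ into a finite number of
sets $\mathcal{E}_k$ [72]. We write $\|\varphi\|_1$ for the strict upper bound of the sums
$t_\delta$, i.e. the total variation of $\varphi(\mathcal{E})$ on $\mathcal{E}_0$. We obviously have
$\|c\varphi\|_1 = |c| \cdot \|\varphi\|_1$, where $c$ is a constant. If the subdivision $\delta'$ is a
continuation of the subdivision $\delta$, we write $\delta' > \delta$. On observing that,
given any subdivision $e = e' + e''$, we have $|\varphi(e)| \leqslant |\varphi(e')| + |\varphi(e'')|$,
we can say that $t_{\delta'}(\varphi) \geqslant t_\delta(\varphi)$ if $\delta' > \delta$. If $\delta_n$ is a sequence
of subdivisions such that $t_{\delta_n}(\varphi) \to \|\varphi\|_1$ and $\delta'_n > \delta_n$, then all the
more $t_{\delta'_n}(\varphi) \to \|\varphi\|_1$. If $t_{\delta_n}(\varphi) \to \|\varphi\|_1$, $t_{\delta'_n}(\psi) \to \|\psi\|_1$ and
$t_{\delta''_n}(\varphi + \psi) \to \|\varphi + \psi\|_1$, then we obtain on putting $\delta'''_n = \delta_n \delta'_n \delta''_n$:

$$t_{\delta'''_n}(\varphi) \to \|\varphi\|_1; \quad t_{\delta'''_n}(\psi) \to \|\psi\|_1; \quad t_{\delta'''_n}(\varphi + \psi) \to \|\varphi + \psi\|_1.$$

If $\mathcal{E}_k^{(n)}$ are the subsets in $\delta'''_n$, the inequality

$$|\varphi(\mathcal{E}_k^{(n)}) + \psi(\mathcal{E}_k^{(n)})| \leqslant |\varphi(\mathcal{E}_k^{(n)})| + |\psi(\mathcal{E}_k^{(n)})|$$

gives us, on summation and passage to the limit:

$$\|\varphi + \psi\|_1 \leqslant \|\varphi\|_1 + \|\psi\|_1. \tag{59}$$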
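For functions $\varphi(\mathcal{E})$, satisfying the condition

$$\varphi(\mathcal{E}) = 0 \quad \text{for} \quad G(\mathcal{E}) = 0, \tag{60}$$

we consider, in addition to sum (58), the sum

$$S_\delta(\varphi) = \sum_k \frac{\varphi^2(\mathcal{E}_k)}{G(\mathcal{E}_k)}. \tag{61}$$

If $G(\mathcal{E}_k) = 0$, the corresponding term has the form 0/0, and we
regard it as zero. We could dispense with this proviso if we agreed
to consider only those $\delta$ for which all the $G(\mathcal{E}_k) \neq 0$, which amounts
to associating all the $\mathcal{E}_k$ for which $G(\mathcal{E}_k) = 0$ with a different subset.
The set of values of $S_\delta(\varphi)$ is not necessarily bounded. Let us write
$V_2$ for the set of the functions $\varphi(\mathcal{E})$ which satisfy condition (60) and
for which the set of values of $S_\delta(\varphi)$ is bounded. The set $V_2$ forms part of $V_1$.
It is easily seen that, if $\varphi(\mathcal{E}) \in V_2$ and $\psi(\mathcal{E}) \in V_2$, then $c\varphi(\mathcal{E}) \in V_2$
and $\varphi(\mathcal{E}) + \psi(\mathcal{E}) \in V_2$. The strict upper bound of the sums $S_\delta(\varphi)$
will be written as $\|\varphi\|_2$. Let us show that, if $\delta' > \delta$, then
$S_{\delta'}(\varphi) \geqslant S_\delta(\varphi)$. We only need to show that, if $\mathcal{E} = \mathcal{E}' + \mathcal{E}''$ is a subdivision
of $\mathcal{E}$, then

$$\frac{\varphi^2(\mathcal{E}')}{G(\mathcal{E}')} + \frac{\varphi^2(\mathcal{E}'')}{G(\mathcal{E}'')} \geqslant \frac{\varphi^2(\mathcal{E})}{G(\mathcal{E})}. \tag{62}$$

This inequality is equivalent to the following:

$$G(\mathcal{E})\,G(\mathcal{E}'')\,\varphi^2(\mathcal{E}') + G(\mathcal{E})\,G(\mathcal{E}')\,\varphi^2(\mathcal{E}'') - G(\mathcal{E}')\,G(\mathcal{E}'')\,\varphi^2(\mathcal{E}) \geqslant 0,$$

which can be rewritten, since $G(\mathcal{E}) = G(\mathcal{E}') + G(\mathcal{E}'')$ and
$\varphi(\mathcal{E}) = \varphi(\mathcal{E}') + \varphi(\mathcal{E}'')$, in the form

$$[G(\mathcal{E}'')\,\varphi(\mathcal{E}') - G(\mathcal{E}')\,\varphi(\mathcal{E}'')]^2 \geqslant 0.$$

As above, if $S_{\delta_n}(\varphi) \to \|\varphi\|_2$ and $\delta'_n > \delta_n$, then $S_{\delta'_n}(\varphi) \to \|\varphi\|_2$.

We now establish an inequality between $\|\varphi\|_1$ and $\|\varphi\|_2$ for
functions of $V_2$. Application of Cauchy's inequality gives us

$$t_\delta(\varphi) = \sum_k |\varphi(\mathcal{E}_k)| = \sum_k \frac{|\varphi(\mathcal{E}_k)|}{\sqrt{G(\mathcal{E}_k)}}\,\sqrt{G(\mathcal{E}_k)} \leqslant \Big[\sum_k \frac{\varphi^2(\mathcal{E}_k)}{G(\mathcal{E}_k)}\Big]^{1/2} \Big[\sum_k G(\mathcal{E}_k)\Big]^{1/2},$$

i.e. $t_\delta(\varphi) \leqslant \sqrt{S_\delta(\varphi)}\,\sqrt{G(\mathcal{E}_0)}$. As when proving (59), we can form
a sequence of subdivisions $\delta_n$ such that $t_{\delta_n}(\varphi) \to \|\varphi\|_1$ and
$S_{\delta_n}(\varphi) \to \|\varphi\|_2$, and the inequality $t_{\delta_n}(\varphi) \leqslant \sqrt{S_{\delta_n}(\varphi)}\,\sqrt{G(\mathcal{E}_0)}$ gives in the
limit:

$$\|\varphi\|_1 \leqslant \sqrt{\|\varphi\|_2}\,\sqrt{G(\mathcal{E}_0)}. \tag{63}$$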
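
The sums (58) and (61) and the inequality leading to (63) can be followed on a finite model. The sketch below assumes, purely for illustration, that the set $\mathcal{E}_0$ consists of four atoms with prescribed values of $G$ and of a signed additive function; the lists `G_atom`, `phi_atom` and the two subdivisions `coarse` and `fine` are hypothetical choices, not data from the text.

```python
from fractions import Fraction as Fr

# Hypothetical atoms of the set E_0: G-measures and values of phi.
G_atom = [Fr(1), Fr(2), Fr(1, 2), Fr(3)]
phi_atom = [Fr(2), Fr(-1), Fr(1), Fr(3)]

def G(E):
    return sum((G_atom[i] for i in E), Fr(0))

def phi(E):
    return sum((phi_atom[i] for i in E), Fr(0))

def t(delta):
    """Sum (58) for the subdivision delta (a list of disjoint sets of atoms)."""
    return sum((abs(phi(E)) for E in delta), Fr(0))

def S(delta):
    """Sum (61); terms with G(E_k) = 0 are regarded as zero."""
    return sum((phi(E) ** 2 / G(E) for E in delta if G(E) != 0), Fr(0))

coarse = [{0, 1}, {2, 3}]
fine = [{0}, {1}, {2}, {3}]      # a continuation of `coarse`
G_total = G({0, 1, 2, 3})
```

Passing from `coarse` to `fine` increases both sums (from 5 to 7 for `t`, and from 103/21 to 19/2 for `S`), and for either subdivision the square of `t` does not exceed `S` times `G_total`, which is the inequality used to obtain (63).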
One family of functions $\varphi(\mathcal{E})$ must be mentioned. A function
$\varphi(\mathcal{E})$ is said to be essentially bounded if there exists a constant $C$
such that, given any $\mathcal{E}$ of $L_G$ belonging to $\mathcal{E}_0$, we have

$$|\varphi(\mathcal{E})| \leqslant C\,G(\mathcal{E}), \tag{64}$$

and the set of all essentially bounded functions (with different
values of $C$) will be denoted by $V_3$. It follows at once from (61) that,
if $\varphi(\mathcal{E}) \in V_3$, then $S_\delta(\varphi) \leqslant C^2\,G(\mathcal{E}_0)$ for any $\delta$, i.e. $V_3$ forms part
of $V_2$.

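We have already discussed piecewise constant point functions
[46]. We now introduce piecewise constant set functions. A $\psi(\mathcal{E})$
satisfying (60) is said to be piecewise constant in $\mathcal{E}_0$ if there exists
a subdivision of $\mathcal{E}_0$ into a finite number of sets $\mathcal{E}_k$, such that

$$\psi(\mathcal{E}) = \frac{\psi(\mathcal{E}_k)}{G(\mathcal{E}_k)}\,G(\mathcal{E}), \ \text{if}\ \mathcal{E} \subset \mathcal{E}_k \ \text{and}\ G(\mathcal{E}_k) \neq 0; \qquad \psi(\mathcal{E}) = 0, \ \text{if}\ \mathcal{E} \subset \mathcal{E}_k \ \text{and}\ G(\mathcal{E}_k) = 0. \tag{65}$$

If the ratio $\psi(\mathcal{E}_k)/G(\mathcal{E}_k)$ is denoted by $a_k$ (we take $a_k = 0$ if
$G(\mathcal{E}_k) = 0$), the piecewise constant $\psi(\mathcal{E})$ in question can obviously be
written as the integral

$$\psi(\mathcal{E}) = \int_{\mathcal{E}} \sum_k a_k\,\omega_{\mathcal{E}_k}(P)\,G(d\mathcal{E}), \tag{66}$$

where $\omega_{\mathcal{E}_k}(P)$ is the characteristic function of the set $\mathcal{E}_k$. We have
under the integral sign a piecewise constant point function, equal
to $a_k$ on the set $\mathcal{E}_k$. Conversely, every integral of the type indicated
gives a piecewise constant set function $\psi(\mathcal{E})$.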

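Given any subdivision $\delta$ of the set $\mathcal{E}_0$ into a finite number of
sets $\mathcal{E}_k$, we can associate piecewise constant functions $\varphi_\delta(\mathcal{E})$ and
$f_\delta(P)$ with any set function $\varphi(\mathcal{E})$, completely additive on $\mathcal{E}_0$ and
satisfying (60), and with any point function $f(P)$, measurable and
summable on $\mathcal{E}_0$. We define $\varphi_\delta(\mathcal{E})$ for $\mathcal{E}$ belonging to $\mathcal{E}_k$ by the
formula

$$\frac{\varphi_\delta(\mathcal{E})}{G(\mathcal{E})} = \frac{\varphi(\mathcal{E}_k)}{G(\mathcal{E}_k)}, \tag{67}$$

i.e. for any $\mathcal{E}$ of $\mathcal{E}_0$ by the formula

$$\varphi_\delta(\mathcal{E}) = \int_{\mathcal{E}} \sum_k a_k\,\omega_{\mathcal{E}_k}(P)\,G(d\mathcal{E}), \quad \text{where} \quad a_k = \frac{\varphi(\mathcal{E}_k)}{G(\mathcal{E}_k)}. \tag{68}$$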
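We put $f_\delta(P)$ equal to a constant:

$$f_\delta(P) = \frac{1}{G(\mathcal{E}_k)} \int_{\mathcal{E}_k} f(P)\,G(d\mathcal{E}), \tag{69}$$

if $P \in \mathcal{E}_k$. If $G(\mathcal{E}_k) = 0$, (69) is taken equal to zero. If $\varphi(\mathcal{E})$ is expressed
by the integral:

$$\varphi(\mathcal{E}) = \int_{\mathcal{E}} f(P)\,G(d\mathcal{E}), \tag{70}$$

then obviously:

$$\varphi_\delta(\mathcal{E}) = \int_{\mathcal{E}} f_\delta(P)\,G(d\mathcal{E}). \tag{71}$$

It follows at once from the definitions of $\varphi_\delta(\mathcal{E})$ and $f_\delta(P)$ that:

$$(c_1\varphi(\mathcal{E}) + c_2\psi(\mathcal{E}))_\delta = c_1\varphi_\delta(\mathcal{E}) + c_2\psi_\delta(\mathcal{E}); \quad (c_1 f(P) + c_2 F(P))_\delta = c_1 f_\delta(P) + c_2 F_\delta(P). \tag{72}$$

Let us show that, if $\delta' > \delta$, then

$$(\varphi_{\delta'}(\mathcal{E}))_\delta = \varphi_\delta(\mathcal{E}). \tag{73}$$

With $\delta'$, we have a subdivision of each $\mathcal{E}_k$ into sets $\mathcal{E}_{k,s}$, where,
by (67), $\varphi_{\delta'}(\mathcal{E}_{k,s}) = \varphi(\mathcal{E}_{k,s})$, so that, on summing over $s$:

$$a_k = \frac{\varphi_{\delta'}(\mathcal{E}_k)}{G(\mathcal{E}_k)} = \frac{\varphi(\mathcal{E}_k)}{G(\mathcal{E}_k)},$$

whence it follows that

$$(\varphi_{\delta'}(\mathcal{E}))_\delta = \int_{\mathcal{E}} \sum_k \frac{\varphi_{\delta'}(\mathcal{E}_k)}{G(\mathcal{E}_k)}\,\omega_{\mathcal{E}_k}(P)\,G(d\mathcal{E}) = \int_{\mathcal{E}} \sum_k a_k\,\omega_{\mathcal{E}_k}(P)\,G(d\mathcal{E}) = \varphi_\delta(\mathcal{E}).$$

If $c$ is the greatest of the $|a_k|$, by (65), $|\psi(\mathcal{E})| \leqslant c\,G(\mathcal{E})$, i.e. every
piecewise constant set function is essentially bounded. Notice
that we are considering piecewise constant functions only with
subdivisions of $\mathcal{E}_0$ into a finite number of sets.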

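Let $\delta' > \delta$ and $\mathcal{E}_{k,s}$ be the above-mentioned subsets. It follows
from the definition of $\varphi_\delta(\mathcal{E})$ that

$$\frac{\varphi_\delta(\mathcal{E}_{k,s})}{G(\mathcal{E}_{k,s})} = \frac{\varphi_\delta(\mathcal{E}_k)}{G(\mathcal{E}_k)}; \quad \varphi_\delta(\mathcal{E}_k) = \varphi(\mathcal{E}_k),$$

so that

$$S_{\delta'}(\varphi_\delta) = \sum_k \sum_s \frac{\varphi_\delta^2(\mathcal{E}_{k,s})}{G(\mathcal{E}_{k,s})} = \sum_k \sum_s \frac{\varphi_\delta(\mathcal{E}_k)}{G(\mathcal{E}_k)}\,\varphi_\delta(\mathcal{E}_{k,s}) = \sum_k \frac{\varphi(\mathcal{E}_k)}{G(\mathcal{E}_k)}\,\varphi_\delta(\mathcal{E}_k) = \sum_k \frac{\varphi^2(\mathcal{E}_k)}{G(\mathcal{E}_k)},$$

i.e.

$$S_{\delta'}(\varphi_\delta) = S_\delta(\varphi). \tag{74}$$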
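
Relations (73) and (74) can be verified on a finite model in which the averaging operation of (67) is carried out explicitly. As before, this is only a sketch: the atoms, the values in `G_atom` and `phi_atom`, and the helper names `project`, `S` and `subsets` are assumptions made for illustration and are not taken from the text.

```python
from fractions import Fraction as Fr
from itertools import combinations

# Hypothetical atoms of E_0 with their G-measures and phi-values.
G_atom = [Fr(1), Fr(2), Fr(1, 2), Fr(3)]
phi_atom = [Fr(2), Fr(-1), Fr(1), Fr(3)]

def G(E):
    return sum((G_atom[i] for i in E), Fr(0))

def phi(E):
    return sum((phi_atom[i] for i in E), Fr(0))

def project(set_fn, delta):
    """Return the piecewise constant function (set_fn)_delta defined by (67)."""
    def projected(E):
        total = Fr(0)
        for block in delta:
            if G(block) != 0:
                total += set_fn(block) / G(block) * G(set(E) & block)
        return total
    return projected

def S(set_fn, delta):
    """Sum (61) for the set function set_fn and the subdivision delta."""
    return sum((set_fn(b) ** 2 / G(b) for b in delta if G(b) != 0), Fr(0))

def subsets(n):
    """All subsets of {0, ..., n-1}."""
    for r in range(n + 1):
        for c in combinations(range(n), r):
            yield set(c)

coarse = [{0, 1}, {2, 3}]            # subdivision delta
fine = [{0}, {1}, {2}, {3}]          # subdivision delta' > delta
phi_coarse = project(phi, coarse)    # phi_delta
phi_fine = project(phi, fine)        # phi_delta'
twice = project(phi_fine, coarse)    # (phi_delta')_delta
```

In this model `twice` coincides with `phi_coarse` on every subset, as (73) asserts, and `S(phi_coarse, fine)` equals `S(phi, coarse)`, as (74) asserts.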
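If $S_{\delta_n}(\varphi_\delta) \to \|\varphi_\delta\|_2$, then all the more $S_{\delta'_n}(\varphi_\delta) \to \|\varphi_\delta\|_2$ when
$\delta'_n = \delta_n \delta$. But $\delta'_n > \delta$ and, by (74), $S_{\delta'_n}(\varphi_\delta) = S_\delta(\varphi)$ and does not
depend on $n$, whence it follows that

$$\|\varphi_\delta\|_2 = S_\delta(\varphi). \tag{75}$$

If $\varphi(\mathcal{E}) \in V_2$, there exists a sequence of subdivisions $\delta^{(n)}$ such that
$S_{\delta^{(n)}}(\varphi) \to \|\varphi\|_2$, and hence there exists a sequence of subdivisions
such that

$$\|\varphi_{\delta^{(n)}}\|_2 \to \|\varphi\|_2. \tag{76}$$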

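The quantity (75) is easily expressed by an integral. Let $g_\delta(P)$
be the integrand in integral (66) for the piecewise constant function
$\varphi_\delta(\mathcal{E})$:

$$g_\delta(P) = a_k = \varphi(\mathcal{E}_k) : G(\mathcal{E}_k), \quad \text{if } P \in \mathcal{E}_k.$$

We have

$$\frac{\varphi^2(\mathcal{E}_k)}{G(\mathcal{E}_k)} = a_k^2\,G(\mathcal{E}_k) = \int_{\mathcal{E}_k} g_\delta^2(P)\,G(d\mathcal{E}),$$

so that

$$\sum_k \frac{\varphi^2(\mathcal{E}_k)}{G(\mathcal{E}_k)} = \int_{\mathcal{E}_0} g_\delta^2(P)\,G(d\mathcal{E}),$$

i.e.

$$S_\delta(\varphi) = \|\varphi_\delta\|_2 = \int_{\mathcal{E}_0} g_\delta^2(P)\,G(d\mathcal{E}), \tag{77}$$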

and, on using the usual notation for the norm in L,: 
Il Fa lle = I goP) lle - (78) 


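Let us bring in two further formulae, required later. By (73), we 
have $[\varphi(\mathcal{E}) - \varphi_\delta(\mathcal{E})]_{\delta'} = \varphi_{\delta'}(\mathcal{E}) - \varphi_\delta(\mathcal{E})$ for $\delta' > \delta$. The difference 
$g_{\delta'}(P) - g_\delta(P)$, which is constant at points of the subsets $\mathcal{E}'_{k,s}$ of 
the subdivision $\delta'$, is the integrand in formula (66) for $\varphi_{\delta'}(\mathcal{E}) - \varphi_\delta(\mathcal{E})$, i.e. 

$\varphi_{\delta'}(\mathcal{E}) - \varphi_\delta(\mathcal{E}) = \int_{\mathcal{E}} [g_{\delta'}(P) - g_\delta(P)]\, G(d\mathcal{E})$, 

and we can write, by (78): 

$\|(\varphi - \varphi_\delta)_{\delta'}\|_2 = \|\varphi_{\delta'} - \varphi_\delta\|_2 = \|g_{\delta'}(P) - g_\delta(P)\|^2_{L_2}$. (79) 

The second formula concerns the function $g_{\delta'}(P)$. It has the constant 
value $a'_{k,s} = \varphi(\mathcal{E}'_{k,s}) / G(\mathcal{E}'_{k,s})$ at points of $\mathcal{E}'_{k,s}$, and, by the definition 
of $f_\delta(P)$, the function $(g_{\delta'}(P))_\delta$ takes at points of $\mathcal{E}_k$ the constant 
value 

$\frac{1}{G(\mathcal{E}_k)} \sum_s \int_{\mathcal{E}'_{k,s}} \frac{\varphi(\mathcal{E}'_{k,s})}{G(\mathcal{E}'_{k,s})}\, G(d\mathcal{E}) = \frac{1}{G(\mathcal{E}_k)} \sum_s \frac{\varphi(\mathcal{E}'_{k,s})}{G(\mathcal{E}'_{k,s})}\, G(\mathcal{E}'_{k,s}) = \frac{\varphi(\mathcal{E}_k)}{G(\mathcal{E}_k)}$, 

i.e. 

$(g_{\delta'}(P))_\delta = g_\delta(P)$. (80) 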


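79. Supplementary propositions (continued). A new concept must 
be introduced, important for later constructions. Let $\varphi(\mathcal{E})$ and 
$\psi(\mathcal{E}) \in V_1$ and $\mathcal{E} = \mathcal{E}' + \mathcal{E}''$ be a subdivision of $\mathcal{E}$ into two disjoint 
subsets. We write $\inf[\varphi, \psi]$ for the strict lower bound of the 
sums $\varphi(\mathcal{E}') + \psi(\mathcal{E}'')$ for all possible subdivisions of $\mathcal{E}$: 

$\inf[\varphi, \psi] = \inf_{\mathcal{E} = \mathcal{E}' + \mathcal{E}''} [\varphi(\mathcal{E}') + \psi(\mathcal{E}'')] = \omega(\mathcal{E})$, (81) 

i.e. 

$\varphi(\mathcal{E}') + \psi(\mathcal{E}'') \ge \omega(\mathcal{E})$, (82) 

and, given any positive $\varepsilon$, there exists a subdivision $\mathcal{E} = \mathcal{E}' + \mathcal{E}''$ 
such that 

$\varphi(\mathcal{E}') + \psi(\mathcal{E}'') < \omega(\mathcal{E}) + \varepsilon$. (83) 

For any $\mathcal{E}$ of $\mathcal{E}_0$, the function $\omega(\mathcal{E})$ has a finite value, since $\varphi(\mathcal{E}')$ 
and $\psi(\mathcal{E}'')$ are bounded. If we take $\mathcal{E}$ itself as $\mathcal{E}'$ and the empty set 
as $\mathcal{E}''$, or vice versa, we get 

$\omega(\mathcal{E}) \le \varphi(\mathcal{E}); \quad \omega(\mathcal{E}) \le \psi(\mathcal{E})$. (84) 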
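
As a finite illustration of definition (81), the sketch below assumes (this is not part of the text) that the basic set consists of finitely many points and that the set functions are given by point weights, so that the value on a set is a sum of weights. The helper names `set_value` and `inf_pair` are hypothetical. The brute-force search runs over all subdivisions into two disjoint parts; in this atomic setting the infimum reduces to the sum of pointwise minima, and the inequalities (84) can be checked directly.

```python
from itertools import combinations

def set_value(weights, subset):
    """Additive set function: sum of the point weights over the subset."""
    return sum(weights[p] for p in subset)

def inf_pair(phi, psi, points):
    """Brute-force minimum of phi(E') + psi(E'') over all splits E = E' + E''."""
    points = list(points)
    best = None
    for r in range(len(points) + 1):
        for first in combinations(points, r):
            second = [p for p in points if p not in first]
            value = set_value(phi, first) + set_value(psi, second)
            if best is None or value < best:
                best = value
    return best

# Hypothetical point weights on a four-point basic set.
phi = {0: 3.0, 1: 0.5, 2: 2.0, 3: 0.0}
psi = {0: 1.0, 1: 4.0, 2: 2.5, 3: 1.5}
E = [0, 1, 2, 3]

omega = inf_pair(phi, psi, E)                       # omega(E) as in (81)
pointwise = sum(min(phi[p], psi[p]) for p in E)     # closed form, atomic case only
```

Taking the whole set as one part and the empty set as the other gives the two bounds in (84), which is why `omega` cannot exceed either `set_value(phi, E)` or `set_value(psi, E)`.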


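Let us show that $\omega(\mathcal{E})$ is completely additive. Given a subdivision 
of $\mathcal{E}$ into a finite or infinite number of (disjoint) sets $\mathcal{E}_k$, we have 

$\varphi(\mathcal{E}) = \sum_k \varphi(\mathcal{E}_k); \quad \psi(\mathcal{E}) = \sum_k \psi(\mathcal{E}_k)$, 

these series being absolutely convergent. By (84), the positive $\omega(\mathcal{E}_k)$ 
form a convergent series, since the series with terms $\varphi(\mathcal{E}_k)$ and $\psi(\mathcal{E}_k)$ 
are absolutely convergent, so that the whole of the series formed 
from the $\omega(\mathcal{E}_k)$ has a definite sum, independent of the order of the 
terms [I; 134]. Given $\varepsilon > 0$, there exists a subdivision $\mathcal{E} = \mathcal{E}' + \mathcal{E}''$, 
such that 

$\varphi(\mathcal{E}') + \psi(\mathcal{E}'') \ge \omega(\mathcal{E})$ and $\varphi(\mathcal{E}') + \psi(\mathcal{E}'') < \omega(\mathcal{E}) + \varepsilon$. (85) 

We write $\mathcal{E}'_k = \mathcal{E}_k \mathcal{E}'$ and $\mathcal{E}''_k = \mathcal{E}_k \mathcal{E}''$, so that 
$\mathcal{E}'_k + \mathcal{E}''_k = \mathcal{E}_k; \quad \sum_k \mathcal{E}'_k = \mathcal{E}'; \quad \sum_k \mathcal{E}''_k = \mathcal{E}''$. 
On using the definition of $\omega(\mathcal{E})$, we can write $\varphi(\mathcal{E}'_k) + \psi(\mathcal{E}''_k) \ge \omega(\mathcal{E}_k)$, 
and we obtain on summing over $k$, since $\varphi(\mathcal{E})$ and $\psi(\mathcal{E})$ 
are completely additive: 

$\sum_k \omega(\mathcal{E}_k) \le \varphi(\mathcal{E}') + \psi(\mathcal{E}'')$, 
and the second of inequalities (85) gives 
$\sum_k \omega(\mathcal{E}_k) < \omega(\mathcal{E}) + \varepsilon$, 
whence, since $\varepsilon$ is arbitrary, we have 

$\sum_k \omega(\mathcal{E}_k) \le \omega(\mathcal{E})$. (86) 

We now prove the reverse inequality. We take a subdivision $\mathcal{E}_k = \mathcal{E}_{k,1} + \mathcal{E}_{k,2}$ 
such that 

$\varphi(\mathcal{E}_{k,1}) + \psi(\mathcal{E}_{k,2}) < \omega(\mathcal{E}_k) + \frac{\varepsilon}{2^k}$. (87) 

On summing over $k$, and writing $\mathcal{E}_1$ for the sum of the $\mathcal{E}_{k,1}$ and 
$\mathcal{E}_2$ for the sum of the $\mathcal{E}_{k,2}$, we get 

$\varphi(\mathcal{E}_1) + \psi(\mathcal{E}_2) < \sum_k \omega(\mathcal{E}_k) + \varepsilon$, (88) 

where $\mathcal{E}_1 \mathcal{E}_2 = 0$ and $\mathcal{E}_1 + \mathcal{E}_2 = \mathcal{E}$. But $\varphi(\mathcal{E}_1) + \psi(\mathcal{E}_2) \ge \omega(\mathcal{E})$, by 
the definition of $\omega(\mathcal{E})$, and inequality (88) gives 

$\omega(\mathcal{E}) < \sum_k \omega(\mathcal{E}_k) + \varepsilon$, 

and, since $\varepsilon$ is arbitrary, we get the reverse of inequality (86), whence 
it follows that $\omega(\mathcal{E})$ is completely additive: 

$\omega(\mathcal{E}) = \sum_k \omega(\mathcal{E}_k)$. 

It follows from what has been said that the $\omega(\mathcal{E})$ defined by (81) 
belongs to $V_1$. 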
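We introduce the further notation for any function $\varphi(\mathcal{E})$ of $V_1$: 

$\varphi_n(\mathcal{E}) = \inf[\varphi, nG]$. (89) 

On observing that $G(\mathcal{E}) \ge 0$, we can say that $\varphi_{n+1}(\mathcal{E}) \ge \varphi_n(\mathcal{E})$, 
and in addition, by (84), $\varphi_n(\mathcal{E}) \le \varphi(\mathcal{E})$. Thus, given any $\mathcal{E}$ belonging 
to $\mathcal{E}_0$, the sequence $\varphi_n(\mathcal{E})$ has a finite limit as $n \to \infty$. Let us also 
recall the definition of absolute continuity: 

$\varphi(\mathcal{E})$ is said to be absolutely continuous on $\mathcal{E}_0$, if, given any $\varepsilon > 0$, 
there exists an $\eta > 0$ such that $|\varphi(\mathcal{E})| < \varepsilon$ if $\mathcal{E} \subset \mathcal{E}_0$, $\mathcal{E}$ belongs 
to $L_G$ and $G(\mathcal{E}) < \eta$. 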

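LEMMA 1. If $\varphi(\mathcal{E})$ is absolutely continuous on $\mathcal{E}_0$, then $\varphi_n(\mathcal{E}) \to \varphi(\mathcal{E})$ 
for any $\mathcal{E}$. 

Given $\varepsilon > 0$, we have for some subdivision $\mathcal{E} = \mathcal{E}'_n + \mathcal{E}''_n$, by (83): 

$\varphi(\mathcal{E}'_n) + nG(\mathcal{E}''_n) < \varphi_n(\mathcal{E}) + \varepsilon \le \varphi(\mathcal{E}) + \varepsilon$. (90) 

But, by Theorem 1 of [72], $\varphi(\mathcal{E}'_n) \ge -l$, where $l$ is a definite number, 
and it follows from (90) that 

$G(\mathcal{E}''_n) < \frac{\varphi(\mathcal{E}) + l + \varepsilon}{n}$, (91) 

where the right-hand side is $< \eta$ for all sufficiently large $n$, whence, 
in view of the absolute continuity of $\varphi(\mathcal{E})$, we have $|\varphi(\mathcal{E}''_n)| < \varepsilon$ 
for sufficiently large $n$. The first of inequalities (90) gives $\varphi(\mathcal{E}'_n) < \varphi_n(\mathcal{E}) + \varepsilon$. 
But $\varphi(\mathcal{E}'_n) = \varphi(\mathcal{E}) - \varphi(\mathcal{E}''_n)$, so that $\varphi(\mathcal{E}) < \varphi_n(\mathcal{E}) + \varepsilon + \varphi(\mathcal{E}''_n)$, 
and, since $|\varphi(\mathcal{E}''_n)| < \varepsilon$, we have $\varphi(\mathcal{E}) - \varphi_n(\mathcal{E}) < 2\varepsilon$, 
whence it follows, since $\varepsilon$ is arbitrary, that $\varphi_n(\mathcal{E}) \to \varphi(\mathcal{E})$. 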

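LEMMA 2. For a non-negative completely additive function $\varphi(\mathcal{E})$, 
the limit of $\varphi_n(\mathcal{E})$ is a completely additive function, absolutely continuous 
on $\mathcal{E}_0$. 

We use the notation $\lim_{n \to \infty} \varphi_n(\mathcal{E}) = \varphi^{(0)}(\mathcal{E})$. On observing that the 
$\varphi_n(\mathcal{E})$ are completely additive and non-negative, and using the 
lemma of [63], we can say that $\varphi^{(0)}(\mathcal{E})$ is completely additive. It 
follows from (84) that $0 \le \varphi_n(\mathcal{E}) \le nG(\mathcal{E})$, so that each of the $\varphi_n(\mathcal{E})$ 
is absolutely continuous. 

Further, since $\varphi^{(0)}(\mathcal{E}_0 - \mathcal{E}) \ge \varphi_n(\mathcal{E}_0 - \mathcal{E})$, where $\mathcal{E} \subset \mathcal{E}_0$, it 
follows that $\varphi^{(0)}(\mathcal{E}_0) - \varphi^{(0)}(\mathcal{E}) \ge \varphi_n(\mathcal{E}_0) - \varphi_n(\mathcal{E})$, i.e. 

$0 \le \varphi^{(0)}(\mathcal{E}) - \varphi_n(\mathcal{E}) \le \varphi^{(0)}(\mathcal{E}_0) - \varphi_n(\mathcal{E}_0)$. (92) 

Hence it is clear that the $\varphi_n(\mathcal{E})$ tend to $\varphi^{(0)}(\mathcal{E})$ uniformly with 
respect to the whole of the $\mathcal{E}$ of $\mathcal{E}_0$. Since the $\varphi_n(\mathcal{E})$ are absolutely 
continuous, we can say that $\varphi^{(0)}(\mathcal{E})$ is also absolutely continuous on $\mathcal{E}_0$. 
For, given an $\varepsilon > 0$, we can fix an $n = n_0$ such that $\varphi^{(0)}(\mathcal{E}_0) - \varphi_{n_0}(\mathcal{E}_0) < \varepsilon/2$. 
Now, by (92), we have $\varphi^{(0)}(\mathcal{E}) < \varphi_{n_0}(\mathcal{E}) + \varepsilon/2$. 
In view of the absolute continuity of $\varphi_{n_0}(\mathcal{E})$, there exists a positive 
$\eta_0$ such that $\varphi_{n_0}(\mathcal{E}) < \varepsilon/2$ if $\mathcal{E} \subset \mathcal{E}_0$ and $G(\mathcal{E}) < \eta_0$, and it follows 
from the inequality written above that $\varphi^{(0)}(\mathcal{E}) < \varepsilon$ if $\mathcal{E} \subset \mathcal{E}_0$ and 
$G(\mathcal{E}) < \eta_0$, and the lemma is proved. 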

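LEMMA 3. If $\varphi(\mathcal{E}) \in V_1$ is non-negative and absolutely continuous 
on $\mathcal{E}_0$, given any $\varepsilon > 0$, there is an essentially bounded function $\psi(\mathcal{E})$ 
such that 

$\|\varphi - \psi\|_1 < \varepsilon$. (93) 

It follows from (84) that, as mentioned above, $0 \le \varphi_n(\mathcal{E}) \le nG(\mathcal{E})$, 
i.e. each of the $\varphi_n(\mathcal{E})$ is essentially bounded. In addition, $\varphi(\mathcal{E}) - \varphi_n(\mathcal{E}) \ge 0$, 
so that the total variation of this difference on $\mathcal{E}_0$ 
is equal to its value for $\mathcal{E}_0$, i.e. $\|\varphi(\mathcal{E}) - \varphi_n(\mathcal{E})\|_1 = \varphi(\mathcal{E}_0) - \varphi_n(\mathcal{E}_0)$. 
But, since $\varphi(\mathcal{E})$ is absolutely continuous, we have by Lemma 1: 
$\varphi_n(\mathcal{E}_0) \to \varphi(\mathcal{E}_0)$, i.e. $\|\varphi(\mathcal{E}) - \varphi_n(\mathcal{E})\|_1 \to 0$, and we can satisfy 
inequality (93) simply by taking as $\psi(\mathcal{E})$ the function $\varphi_n(\mathcal{E})$ with some 
sufficiently large $n$. As a matter of fact, (93) can be satisfied by 
choosing as $\psi(\mathcal{E})$ a function piecewise constant on $\mathcal{E}_0$. We shall 
prove this as a preliminary for functions $\varphi(\mathcal{E})$ of $V_2$. 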

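THEOREM 1. If $\varphi(\mathcal{E}) \in V_2$, given any $\varepsilon > 0$, there exists a piecewise 
constant function $\omega(\mathcal{E})$ such that 

$\|\varphi - \omega\|_2 < \varepsilon$. (94) 

Let $f(P)$ be a measurable function of $L_2$ on $\mathcal{E}_0$ and 

$a = \frac{1}{G(\mathcal{E})} \int_{\mathcal{E}} f(P)\, G(d\mathcal{E})$, 

this expression being assumed equal to zero if $G(\mathcal{E}) = 0$. We have 
the obvious equation: 

$\int_{\mathcal{E}} [f(P) - a]^2\, G(d\mathcal{E}) = \int_{\mathcal{E}} f^2(P)\, G(d\mathcal{E}) - a^2 G(\mathcal{E})$. 

On putting $\mathcal{E} = \mathcal{E}_k$ and $a = a_k = \int_{\mathcal{E}_k} f(P)\, G(d\mathcal{E}) \,/\, G(\mathcal{E}_k)$, where the $\mathcal{E}_k$ are 
the subsets of some subdivision $\delta$ of the set $\mathcal{E}_0$, and summing over $k$, 
we get 
$\|f(P) - f_\delta(P)\|^2_{L_2} = \|f(P)\|^2_{L_2} - \|f_\delta(P)\|^2_{L_2}$. (95) 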
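
Identity (95) can be checked numerically in a finite setting. The sketch below assumes, purely for illustration, a basic set of four points with hypothetical weights `G` and values `f`; the helper names `block_average` and `sq_norm` are not from the text. Here $f_\delta$ is the weighted mean of $f$ over each set of the subdivision, and the squared $L_2$ norm is a weighted sum of squares.

```python
def block_average(f, G, blocks):
    """f_delta: on each block, the G-weighted mean of f (zero if the block has measure 0)."""
    f_delta = {}
    for block in blocks:
        mass = sum(G[p] for p in block)
        mean = sum(f[p] * G[p] for p in block) / mass if mass > 0 else 0.0
        for p in block:
            f_delta[p] = mean
    return f_delta

def sq_norm(f, G):
    """Squared L2 norm with respect to the point weights G."""
    return sum(f[p] ** 2 * G[p] for p in G)

# Hypothetical data: four points, subdivision delta into two blocks.
G = {0: 0.5, 1: 1.5, 2: 1.0, 3: 2.0}
f = {0: 2.0, 1: -1.0, 2: 3.0, 3: 0.5}
blocks = [[0, 1], [2, 3]]

f_delta = block_average(f, G, blocks)
diff = {p: f[p] - f_delta[p] for p in G}
lhs = sq_norm(diff, G)                       # ||f - f_delta||^2
rhs = sq_norm(f, G) - sq_norm(f_delta, G)    # ||f||^2 - ||f_delta||^2
```

The two sides agree because, on each block, the cross term $2a_k \int f\,G(d\mathcal{E})$ equals $2a_k^2 G(\mathcal{E}_k)$, which is the step behind the "obvious equation" preceding (95).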
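If we take $f(P) = g_{\delta'}(P)$, where $\delta' > \delta$ and $g_\delta(P)$ is the function 
appearing in (77), by (80), we have $f_\delta(P) = g_\delta(P)$ and (95) gives: 
$\|g_{\delta'}(P) - g_\delta(P)\|^2_{L_2} = \|g_{\delta'}(P)\|^2_{L_2} - \|g_\delta(P)\|^2_{L_2}$. (96) 
On taking (78) and (79) into account, we can write 

$\|(\varphi - \varphi_\delta)_{\delta'}\|_2 = \|\varphi_{\delta'}\|_2 - \|\varphi_\delta\|_2$. (97) 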




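By (76), there exist sequences of subdivisions $\delta'_n$ and $\delta''_n$ such that 
$\|(\varphi - \varphi_\delta)_{\delta'_n}\|_2 \to \|\varphi - \varphi_\delta\|_2$ and $\|\varphi_{\delta''_n}\|_2 \to \|\varphi\|_2$. For the sequence 
$\delta_n = \delta'_n \delta''_n$, we have all the more: $\|(\varphi - \varphi_\delta)_{\delta_n}\|_2 \to \|\varphi - \varphi_\delta\|_2$ and 
$\|\varphi_{\delta_n}\|_2 \to \|\varphi\|_2$. On putting $\delta' = \delta_n$ in (97), and passing to the 
limit, we get: 

$\|\varphi - \varphi_\delta\|_2 = \|\varphi\|_2 - \|\varphi_\delta\|_2$. (98) 

On again taking (76) into account, we can say that there exists 
a subdivision $\delta$ such that the right-hand side of (98) is $< \varepsilon$, and, putting 
$\omega(\mathcal{E}) = \varphi_\delta(\mathcal{E})$, we get (94). 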

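THEOREM 2. If $\varphi(\mathcal{E}) \in V_1$ and is absolutely continuous in $\mathcal{E}_0$, given 
any $\varepsilon > 0$ there exists a function $\omega(\mathcal{E})$, piecewise constant on $\mathcal{E}_0$, 
such that 

$\|\varphi - \omega\|_1 < \varepsilon$. (99) 

We can write $\varphi$ as the difference between two non-negative functions 
of $V_1$: $\varphi(\mathcal{E}) = \varphi_1(\mathcal{E}) - \varphi_2(\mathcal{E})$, and if the piecewise constant functions 
$\omega_1(\mathcal{E})$ and $\omega_2(\mathcal{E})$ exist, such that $\|\varphi_1 - \omega_1\|_1 < \varepsilon/2$ and 
$\|\varphi_2 - \omega_2\|_1 < \varepsilon/2$, we obtain by (59), on introducing the piecewise 
constant function $\omega(\mathcal{E}) = \omega_1(\mathcal{E}) - \omega_2(\mathcal{E})$: 

$\|\varphi - \omega\|_1 \le \|\varphi_1 - \omega_1\|_1 + \|\varphi_2 - \omega_2\|_1 < \varepsilon$, 

and it is therefore sufficient to prove the theorem for the case of 
non-negative functions $\varphi(\mathcal{E})$. 

By Lemma 3, there exists an essentially bounded function $\psi(\mathcal{E})$ 
such that $\|\varphi - \psi\|_1 < \varepsilon/2$. The function $\psi(\mathcal{E}) \in V_2$ [78], and hence 
there exists a piecewise constant function $\omega(\mathcal{E})$ such that 
$\|\psi - \omega\|_2 < \varepsilon^2/[4G(\mathcal{E}_0)]$. By (63), we have $\|\psi - \omega\|_1 < \varepsilon/2$, and we can write, 
on using (59): 

$\|\varphi - \omega\|_1 \le \|\varphi - \psi\|_1 + \|\psi - \omega\|_1 < \varepsilon$, 

and the theorem is proved. The assertion of Theorem 1 amounts to 
the fact that the piecewise constant functions are everywhere dense 
in $V_2$ with norm $\|\varphi\|_2$, whilst that of Theorem 2 amounts to saying 
that the piecewise constant functions are everywhere dense in the 
space of absolutely continuous functions of $V_1$ with norm $\|\varphi\|_1$. 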


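80. Fundamental theorem. We turn to the proof of the fundamental 
theorem of [73]. As above, it is sufficient to prove the theorem for 
non-negative functions $\varphi(\mathcal{E})$ of $V_1$. As usual, we write $\varphi^{(0)}(\mathcal{E})$ for 
the limit of the $\varphi_n(\mathcal{E})$ defined by (89). By Lemmas 1 and 2, the equation 
$\varphi^{(0)}(\mathcal{E}) = \varphi(\mathcal{E})$ holds when and only when $\varphi(\mathcal{E})$ is absolutely 
continuous in $\mathcal{E}_0$. Let $\varphi(\mathcal{E})$ be not absolutely continuous. We form the 
function 

$\varphi^{(s)}(\mathcal{E}) = \varphi(\mathcal{E}) - \varphi^{(0)}(\mathcal{E})$, (100) 

non-negative and completely additive in $\mathcal{E}_0$. 
We show that $\varphi^{(s)}(\mathcal{E})$ is the singular term in (14) of [73]. We form 

$\varphi^{(s)}_n(\mathcal{E}) = \inf[\varphi^{(s)}, nG]$. (101) 

We recall the subdivision $\mathcal{E} = \mathcal{E}'_n + \mathcal{E}''_n$ satisfying condition (90), 
and inequality (91), by virtue of which $G(\mathcal{E}''_n) \to 0$ as $n \to \infty$. On 
taking definition (101) into account, we can write 

$\varphi^{(s)}_n(\mathcal{E}) \le \varphi(\mathcal{E}'_n) + nG(\mathcal{E}''_n) - \varphi^{(0)}(\mathcal{E}'_n) = \varphi(\mathcal{E}'_n) + nG(\mathcal{E}''_n) - [\varphi^{(0)}(\mathcal{E}) - \varphi^{(0)}(\mathcal{E}''_n)]$, 

i.e., by (90), 

$\varphi^{(s)}_n(\mathcal{E}) < \varphi_n(\mathcal{E}) + \varepsilon - [\varphi^{(0)}(\mathcal{E}) - \varphi^{(0)}(\mathcal{E}''_n)]$. 

But $\varphi^{(0)}(\mathcal{E})$ is absolutely continuous, so that we have $0 \le \varphi^{(0)}(\mathcal{E}''_n) < \varepsilon$ 
for all sufficiently large $n$, i.e. 

$\varphi^{(s)}_n(\mathcal{E}) < \varphi_n(\mathcal{E}) - \varphi^{(0)}(\mathcal{E}) + 2\varepsilon$. 

Recalling that $\varphi_n(\mathcal{E}) \to \varphi^{(0)}(\mathcal{E})$, we get $\lim_{n \to \infty} \varphi^{(s)}_n(\mathcal{E}) \le 2\varepsilon$, i.e. this 
limit is zero for any $\mathcal{E}$, since $\varepsilon$ is arbitrary. But the $\varphi^{(s)}_n(\mathcal{E})$ are 
non-negative and do not decrease as $n$ increases, so that we have for 
any $n$: 

$\varphi^{(s)}_n(\mathcal{E}) = \inf_{\mathcal{E} = \mathcal{E}' + \mathcal{E}''} [\varphi^{(s)}(\mathcal{E}') + nG(\mathcal{E}'')] = 0$. 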


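On applying this to $\mathcal{E}_0$ with $n = 1$, we can say that, given any 
$\varepsilon > 0$, there exist sets $\mathcal{E}_n$ belonging to $\mathcal{E}_0$ and $L_G$ such that 

$\varphi^{(s)}(\mathcal{E}_n) + G(\mathcal{E}_0 - \mathcal{E}_n) < \frac{\varepsilon}{2^n}$, 

whence, since $\varphi^{(s)}(\mathcal{E})$ and $G(\mathcal{E})$ are non-negative, 

$\varphi^{(s)}(\mathcal{E}_n) < \frac{\varepsilon}{2^n}$ and $G(\mathcal{E}_0 - \mathcal{E}_n) < \frac{\varepsilon}{2^n}$. (102) 

We form the set $\mathcal{E}' = \mathcal{E}_1 + \mathcal{E}_2 + \ldots$, belonging to $\mathcal{E}_0$ and $L_G$. 
On observing that $\mathcal{E}_n \subset \mathcal{E}'$ for any $n$, we can write $G(\mathcal{E}_0 - \mathcal{E}') < \varepsilon/2^n$ 
and, on letting $n$ tend to infinity, we get $G(\mathcal{E}_0 - \mathcal{E}') = 0$. 
On the other hand, on observing that 
$\mathcal{E}' = \mathcal{E}_1 + (\mathcal{E}_2 - \mathcal{E}_1) + (\mathcal{E}_3 - \mathcal{E}_2 - \mathcal{E}_1) + \ldots$ 
and using the first of inequalities (102) and the fact that $\varphi^{(s)}(\mathcal{E})$ is 
non-negative, we get $\varphi^{(s)}(\mathcal{E}') < \varepsilon$. Thus 
$\varphi^{(s)}(\mathcal{E}') < \varepsilon$ and $G(\mathcal{E}_0 - \mathcal{E}') = 0$. 

On writing $\mathcal{E}_0 - \mathcal{E}' = \mathcal{E}_\varepsilon$, we can say that, given any $\varepsilon > 0$, 
there exists a set $\mathcal{E}_\varepsilon$ of $\mathcal{E}_0$ such that 

$\varphi^{(s)}(\mathcal{E}_0 - \mathcal{E}_\varepsilon) < \varepsilon$ and $G(\mathcal{E}_\varepsilon) = 0$. 
Let $\varepsilon_n$ be positive and $\to 0$. We can write 
$\varphi^{(s)}(\mathcal{E}_0 - \mathcal{E}_{\varepsilon_n}) < \varepsilon_n$ and $G(\mathcal{E}_{\varepsilon_n}) = 0$. 

We introduce the set $H = \mathcal{E}_{\varepsilon_1} + \mathcal{E}_{\varepsilon_2} + \ldots$ In view of the last 
equation, $G(H) = 0$, and, on observing that $\mathcal{E}_{\varepsilon_n} \subset H$ for any $n$, 
we have by the first inequality: $\varphi^{(s)}(\mathcal{E}_0 - H) < \varepsilon_n$, and, on letting $n$ 
tend to infinity, we get $\varphi^{(s)}(\mathcal{E}_0 - H) = 0$. Thus there exists an $H$ 
such that 

$\varphi^{(s)}(\mathcal{E}_0 - H) = 0$ and $G(H) = 0$. (103) 

Any $\mathcal{E} = \mathcal{E}H + (\mathcal{E} - \mathcal{E}H)$, but $\mathcal{E} - \mathcal{E}H \subset \mathcal{E}_0 - H$, and, since 
$\varphi^{(s)}(\mathcal{E})$ is non-negative, the first of formulae (103) gives $\varphi^{(s)}(\mathcal{E} - \mathcal{E}H) = 0$, 
so that $\varphi^{(s)}(\mathcal{E}) = \varphi^{(s)}(\mathcal{E}H)$. Thus there exists an $H$ such that 

$\varphi^{(s)}(\mathcal{E}) = \varphi^{(s)}(\mathcal{E}H)$ and $G(H) = 0$. 

Hence $\varphi^{(s)}(\mathcal{E})$ is the singular term in (14) of [73], and on taking 
into account (100) and the absolute continuity of $\varphi^{(0)}(\mathcal{E})$, we need 
to prove the following theorem in order to prove the theorem of [73]: 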

THEOREM 3. Any absolutely continuous function $\varphi(\mathcal{E})$ of $V_1$ can be 
expressed by the integral 

$\varphi(\mathcal{E}) = \int_{\mathcal{E}} f(P)\, G(d\mathcal{E})$, (104) 

where $f(P)$ is measurable and integrable on $\mathcal{E}_0$. 
By Theorem 2, there exist piecewise constant functions $\omega_n(\mathcal{E})$ 
such that 

$\|\varphi - \omega_n\|_1 < \frac{1}{2^{n+1}}$, (105) 

whence it follows, by (59), that 

$\|\omega_{n+1} - \omega_n\|_1 \le \|\varphi - \omega_{n+1}\|_1 + \|\varphi - \omega_n\|_1 < \frac{1}{2^n}$. 

But every $\omega_n(\mathcal{E})$ is the integral of a piecewise constant function 
$g_n(P)$ with a finite number of finite values on $\mathcal{E}_0$ [78]: 

$\omega_n(\mathcal{E}) = \int_{\mathcal{E}} g_n(P)\, G(d\mathcal{E})$ 

and 

$\omega_{n+1}(\mathcal{E}) - \omega_n(\mathcal{E}) = \int_{\mathcal{E}} [g_{n+1}(P) - g_n(P)]\, G(d\mathcal{E})$. 

The total variation of this set function, expressible by an integral, 
is given by [73]: 

$\|\omega_{n+1} - \omega_n\|_1 = \int_{\mathcal{E}_0} |g_{n+1}(P) - g_n(P)|\, G(d\mathcal{E})$, 

whence 

$\int_{\mathcal{E}_0} |g_{n+1}(P) - g_n(P)|\, G(d\mathcal{E}) < \frac{1}{2^n}$, (106) 

and the series consisting of these integrals is convergent. The series 

$|g_1(P)| + |g_2(P) - g_1(P)| + |g_3(P) - g_2(P)| + \ldots$ (107) 

is now convergent almost everywhere in $\mathcal{E}_0$ [54], and all the more, 
the series 

$f(P) = g_1(P) + [g_2(P) - g_1(P)] + [g_3(P) - g_2(P)] + \ldots$ 

is convergent almost everywhere, i.e. $g_n(P) \to f(P)$ almost everywhere 
in $\mathcal{E}_0$. The sum of series (107) is an integrable function on $\mathcal{E}_0$ by virtue 
of (106). But $|f(P)| \le$ this sum, so that $f(P)$ is also integrable. 
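We can write 

$f(P) = g_n(P) + [g_{n+1}(P) - g_n(P)] + [g_{n+2}(P) - g_{n+1}(P)] + \ldots$ 

and, on taking (106) into account, we obtain for any $\mathcal{E}$ of $\mathcal{E}_0$: 

$\int_{\mathcal{E}} |f(P) - g_n(P)|\, G(d\mathcal{E}) < \frac{1}{2^n} + \frac{1}{2^{n+1}} + \ldots = \frac{1}{2^{n-1}}$, 

whence it follows at once that 

$\lim_{n \to \infty} \omega_n(\mathcal{E}) = \lim_{n \to \infty} \int_{\mathcal{E}} g_n(P)\, G(d\mathcal{E}) = \int_{\mathcal{E}} f(P)\, G(d\mathcal{E})$ 

for any $\mathcal{E}$. On the other hand, it follows from the definition of the 
total variation over $\mathcal{E}_0$ and (105) that 

$|\varphi(\mathcal{E}) - \omega_n(\mathcal{E})| \le \|\varphi - \omega_n\|_1 < \frac{1}{2^{n+1}}$, 

whence $\omega_n(\mathcal{E}) \to \varphi(\mathcal{E})$ for any $\mathcal{E}$, i.e. 

$\varphi(\mathcal{E}) = \int_{\mathcal{E}} f(P)\, G(d\mathcal{E})$, 

and the theorem is proved. We have so far assumed that $G(\mathcal{E}_0)$ is a 
finite number. If $G(\mathcal{E}_0) = +\infty$, the result is obtained by a passage 
to the limit from the sets $\mathcal{E}_n$ with a finite value of $G(\mathcal{E}_n)$, for which 
the theorem is proved, $f(P)$ being independent of $n$. 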
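
The decomposition constructed in this section can be illustrated on a finite set. The sketch below is only an illustration under assumptions that are not in the text: the basic set has three points, both $\varphi$ and $G$ are given by hypothetical non-negative point weights, and the closed form used in `phi_n` (for each point, the smaller of the weight of $\varphi$ and $n$ times the weight of $G$) is what definition (89) reduces to in this atomic case.

```python
def phi_n(phi, G, subset, n):
    """inf[phi, nG] for non-negative point weights: pointwise minimum, summed."""
    return sum(min(phi[p], n * G[p]) for p in subset)

# Hypothetical weights; the point 1 carries phi-mass but has G-measure zero.
phi = {0: 2.0, 1: 1.0, 2: 0.5}
G = {0: 1.0, 1: 0.0, 2: 0.25}
E0 = [0, 1, 2]

values = [phi_n(phi, G, E0, n) for n in range(1, 6)]   # non-decreasing in n
abs_cont_part = values[-1]                             # limit phi^(0)(E0), reached for n >= 2
singular_part = sum(phi.values()) - abs_cont_part      # phi(E0) - phi^(0)(E0), as in (100)
H = [p for p in E0 if G[p] == 0.0]                     # set of G-measure zero carrying the singular part
```

In this example the singular part is concentrated on the single point of `H`, which has zero measure with respect to `G`, matching the pair of conditions in (103).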


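81. Hellinger's integrals. Let us investigate in more detail the 
family $V_2$. We note first of all that, in view of condition (60), if 
$\varphi(\mathcal{E}) \in V_2$, it is absolutely continuous. We have (98) 

$\|\varphi - \varphi_\delta\|_2 = \|\varphi\|_2 - \|\varphi_\delta\|_2$ (108) 

and, in addition [78]: 

$\varphi_\delta(\mathcal{E}) = \int_{\mathcal{E}} g_\delta(P)\, G(d\mathcal{E})$ and $\|\varphi_\delta\|_2 = \int_{\mathcal{E}_0} g_\delta^2(P)\, G(d\mathcal{E})$. (109) 

By (76), we can choose a sequence of subdivisions $\delta_n$ of $\mathcal{E}_0$ so as to 
have 

$\|\varphi - \varphi_{\delta_n}\|_2 = \|\varphi\|_2 - \|\varphi_{\delta_n}\|_2 < \frac{1}{4^{n+1} G(\mathcal{E}_0)}$. (110) 

We now have, by (62): $\|\varphi - \varphi_{\delta_n}\|_1 < 1/2^{n+1}$, and it follows from 
the proof of the theorem of the previous section that $g_{\delta_n}(P) \to f(P)$ 
almost everywhere on $\mathcal{E}_0$ and 

$\varphi(\mathcal{E}) = \int_{\mathcal{E}} f(P)\, G(d\mathcal{E})$. (111) 

It follows from (109) and (110) that now: 

$\int_{\mathcal{E}_0} g_{\delta_n}^2(P)\, G(d\mathcal{E}) \le \|\varphi\|_2$ and $\int_{\mathcal{E}_0} g_{\delta_n}^2(P)\, G(d\mathcal{E}) \to \|\varphi\|_2$. 

On using Theorem 4 of [54], we can write 

$\int_{\mathcal{E}_0} f^2(P)\, G(d\mathcal{E}) \le \|\varphi\|_2$, (112) 

whence it follows that $f(P) \in L_2$ on $\mathcal{E}_0$. We shall now prove the 
reverse inequality to (112). 
On applying Buniakowski's inequality to (111), we can write 

$\varphi^2(\mathcal{E}_k) \le \int_{\mathcal{E}_k} f^2(P)\, G(d\mathcal{E}) \int_{\mathcal{E}_k} G(d\mathcal{E}) = G(\mathcal{E}_k) \int_{\mathcal{E}_k} f^2(P)\, G(d\mathcal{E})$. 

On dividing by $G(\mathcal{E}_k)$ and summing over $k$, we get 

$S_\delta(\varphi) = \sum_k \frac{\varphi^2(\mathcal{E}_k)}{G(\mathcal{E}_k)} \le \int_{\mathcal{E}_0} f^2(P)\, G(d\mathcal{E})$, (113) 

so that we have for the strict upper bound of the sums written: 

$\|\varphi\|_2 \le \int_{\mathcal{E}_0} f^2(P)\, G(d\mathcal{E})$. 

Comparison with (112) gives us 

$\|\varphi\|_2 = \int_{\mathcal{E}_0} f^2(P)\, G(d\mathcal{E})$. (114) 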


We have shown that, for every $\varphi(\mathcal{E})$ of $V_2$, the function $f(P)$ appearing 
in (111) belongs to $L_2$, and that (114) holds. Conversely, if we 
know that $\varphi(\mathcal{E})$ can be expressed by (111), where $f(P) \in L_2$ on $\mathcal{E}_0$, 
then it follows from (113), deduced solely on the basis of (111), that 
$\varphi(\mathcal{E}) \in V_2$. Since the representation by (111) is unique, we can say 
that (114) in fact holds. We thus arrive at the following important 
theorem: 

THEOREM 1. The necessary and sufficient condition for $\varphi(\mathcal{E})$ to belong 
to $V_2$ on $\mathcal{E}_0$ is that $\varphi(\mathcal{E})$ be expressible by (111), where $f(P) \in L_2$ on $\mathcal{E}_0$. 
If this condition is satisfied, (114) holds. 

Let us indicate another necessary and sufficient condition for 
$\varphi(\mathcal{E})$ to belong to $V_2$. 

THEOREM 2. The necessary and sufficient condition for $\varphi(\mathcal{E})$ to belong 
to $V_2$ on $\mathcal{E}_0$ is that there exists a non-negative function $H(\mathcal{E})$, completely 
additive on $\mathcal{E}_0$, such that 

$\varphi^2(\mathcal{E}) \le G(\mathcal{E})\, H(\mathcal{E})$. (115) 

For, if this condition is fulfilled, the sums $S_\delta(\varphi)$ are bounded: 

$S_\delta(\varphi) = \sum_k \frac{\varphi^2(\mathcal{E}_k)}{G(\mathcal{E}_k)} \le \sum_k H(\mathcal{E}_k) = H(\mathcal{E}_0)$. 

Conversely, if $\varphi(\mathcal{E}) \in V_2$, (111) holds and $f(P) \in L_2$ on $\mathcal{E}_0$ and 
hence on any subset of $\mathcal{E}_0$ measurable with respect to $G(\mathcal{E})$. We put 

$H(\mathcal{E}) = \int_{\mathcal{E}} f^2(P)\, G(d\mathcal{E})$. (116) 

On applying Buniakowski's inequality to (111), we get (115), and 
the theorem is thus proved. 
If $\varphi(\mathcal{E}) \in V_2$ on $\mathcal{E}_0$, the strict upper bound of the sums $S_\delta(\varphi)$ is 
called a Hellinger integral and is denoted by the following symbol: 

$\|\varphi\|_2 = \sup S_\delta(\varphi) = \int_{\mathcal{E}_0} \frac{\varphi^2(d\mathcal{E})}{G(d\mathcal{E})}$. (117) 

Formula (114) now gives the transformation of a Hellinger to a 
Lebesgue integral: 

$\int_{\mathcal{E}_0} \frac{\varphi^2(d\mathcal{E})}{G(d\mathcal{E})} = \int_{\mathcal{E}_0} f^2(P)\, G(d\mathcal{E})$. (118) 

If the subdivision $\delta'$ is a continuation of $\delta$, and $i$ is integral (117), 
i.e. the strict upper bound of the sums $S_\delta(\varphi)$, then, as we know, 
$S_\delta(\varphi) \le S_{\delta'}(\varphi) \le i$. On taking this into account, we can say that integral 
(117) has the following property: given any $\varepsilon > 0$, there exists a 
subdivision $\delta_\varepsilon$ such that, for any continuation of it, 

$|i - S_{\delta'}(\varphi)| < \varepsilon$ ($\delta'$ is a continuation of $\delta_\varepsilon$). (119) 


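Let us show that there can only be one $i$ with this property. 
Suppose $i'$ is a further number with the property. In addition to 
(119), we shall have $|i' - S_{\delta'}(\varphi)| < \varepsilon$, where $\delta'_\varepsilon$ plays the role of 
$\delta_\varepsilon$ in (119). On taking the subdivision $\delta''_\varepsilon = \delta_\varepsilon \delta'_\varepsilon$, we can write the 
two inequalities: 

$|i - S_\delta(\varphi)| < \varepsilon$ and $|i' - S_\delta(\varphi)| < \varepsilon$ ($\delta$ is a continuation of $\delta''_\varepsilon$). 
Since $i' - i = (i' - S_\delta(\varphi)) + (S_\delta(\varphi) - i)$, we obtain for these $\delta$: 
$|i' - i| \le |i' - S_\delta(\varphi)| + |i - S_\delta(\varphi)| < 2\varepsilon$, 
whence, since $\varepsilon$ is arbitrary, it follows that $i' = i$. Let $\varphi(\mathcal{E})$ and 
$\varphi_1(\mathcal{E}) \in V_2$; we consider the sum 

$S_\delta(\varphi, \varphi_1) = \sum_k \frac{\varphi(\mathcal{E}_k)\, \varphi_1(\mathcal{E}_k)}{G(\mathcal{E}_k)}$, (120) 

which can obviously be written as 

$S_\delta(\varphi, \varphi_1) = \frac{1}{2} \sum_k \frac{[\varphi(\mathcal{E}_k) + \varphi_1(\mathcal{E}_k)]^2}{G(\mathcal{E}_k)} - \frac{1}{2} \sum_k \frac{\varphi^2(\mathcal{E}_k)}{G(\mathcal{E}_k)} - \frac{1}{2} \sum_k \frac{\varphi_1^2(\mathcal{E}_k)}{G(\mathcal{E}_k)}$, (121) 

i.e. 

$S_\delta(\varphi, \varphi_1) = \frac{1}{2} S_\delta(\varphi + \varphi_1) - \frac{1}{2} S_\delta(\varphi) - \frac{1}{2} S_\delta(\varphi_1)$. 

We have property (119) for each of the sums on the right-hand 
side, where the $\delta_\varepsilon$ can be taken as the same, since different $\delta_\varepsilon$ can 
be replaced by their product. We therefore have property (119) for 
the sums $S_\delta(\varphi, \varphi_1)$; the corresponding $i$ for sums (120) is written as 

$i = \int_{\mathcal{E}_0} \frac{\varphi(d\mathcal{E})\, \varphi_1(d\mathcal{E})}{G(d\mathcal{E})}$. (122) 

On taking (121) into account, we can write 

$\int_{\mathcal{E}_0} \frac{\varphi(d\mathcal{E})\, \varphi_1(d\mathcal{E})}{G(d\mathcal{E})} = \frac{1}{2} \int_{\mathcal{E}_0} \frac{[\varphi(d\mathcal{E}) + \varphi_1(d\mathcal{E})]^2}{G(d\mathcal{E})} - \frac{1}{2} \int_{\mathcal{E}_0} \frac{\varphi^2(d\mathcal{E})}{G(d\mathcal{E})} - \frac{1}{2} \int_{\mathcal{E}_0} \frac{\varphi_1^2(d\mathcal{E})}{G(d\mathcal{E})}$, 

or, on taking (118) into account: 

$\int_{\mathcal{E}_0} \frac{\varphi(d\mathcal{E})\, \varphi_1(d\mathcal{E})}{G(d\mathcal{E})} = \int_{\mathcal{E}_0} f(P)\, f_1(P)\, G(d\mathcal{E})$, (123) 

where $f_1(P)$ is the point function of $L_2$ corresponding to $\varphi_1(\mathcal{E})$: 

$\varphi_1(\mathcal{E}) = \int_{\mathcal{E}} f_1(P)\, G(d\mathcal{E})$. (124) 

We can investigate in the same way more general sums of the form 

$\sigma_\delta = \sum_k u(P_k)\, \frac{\varphi(\mathcal{E}_k)\, \varphi_1(\mathcal{E}_k)}{G(\mathcal{E}_k)}$, (125) 

where $u(P)$ is bounded and measurable with respect to $G(\mathcal{E})$, and 
$P_k$ is any point of $\mathcal{E}_k$. It can be shown that there exists for these 
sums a unique number $i$ with property (119), in which $S_{\delta'}$ has to 
be replaced by $\sigma_{\delta'}$, the inequality in (119) being fulfilled for any 
choice of $P_k$. This number $i$ can be expressed by a Lebesgue-Stieltjes 
integral: 

$i = \int_{\mathcal{E}_0} u(P)\, f(P)\, f_1(P)\, G(d\mathcal{E})$. (126) 

Property (119) is at the basis of the general definition of the integral. 
The next sections give a more detailed treatment of the case of a 
single variable. 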

82. The case of a single variable. We shall investigate this case by 
starting from a point function and considering the simplest 
case of continuous functions. The following notation will be used 
for brevity: if $\Delta$ is an interval $[\alpha, \beta]$, the symbol $\Delta\tau(x)$ will denote 
the difference $\tau(\beta) - \tau(\alpha)$. Let $g(x)$ be a non-decreasing function 
continuous on the finite interval $[a, b]$, and $F(x)$ a real function, 
continuous on this interval, with the property that $\Delta F(x) = 0$ if 
$\Delta g(x) = 0$. Let $\delta$ be a subdivision of $[a, b]$ into a finite number of 
sub-intervals $\Delta_k$ and 

$S_\delta = \sum_k \frac{(\Delta_k F)^2}{\Delta_k g}$. (127) 

The terms of the form 0/0 are assumed zero. This sum does not 
decrease on adding new points of subdivision [78]. Since we shall 
be making use of a division into sub-intervals, we need to prove the 
theorems analogous to those that we had for a division into sets 
measurable with respect to $G(\mathcal{E})$. 

THEOREM 1. The necessary and sufficient condition for the set of 
values of the sums $S_\delta$ to be bounded is that there exists a non-decreasing 
function $h(x)$, bounded on $[a, b]$, such that 

$(\Delta F)^2 \le \Delta g \cdot \Delta h$, (128) 

for any sub-interval of $[a, b]$. 

If this condition is satisfied, the terms of $S_\delta$ do not exceed $\Delta_k h$ 
and we have $S_\delta \le h(b) - h(a)$ for any subdivision. 

We now prove the necessity of (128). If the $S_\delta$ are bounded for 
the whole of $[a, b]$, they are all the more bounded for any part of 
$[a, b]$. Let $h(x)$ denote the strict upper bound of $S_\delta$ for the interval 
$[a, x]$: 

$h(x) = \sup S_\delta$ for the interval $[a, x]$. 

It can be shown, along the same lines as when proving that the 
total variation is completely additive [8], that the strict upper bound 
of $S_\delta$ for any sub-interval $[\alpha, \beta]$ is equal to $h(\beta) - h(\alpha) = \Delta h$, so 
that $h(x)$ is a non-decreasing and bounded function. If we do not 
divide $\Delta$ into sub-intervals, the sum $S_\delta$ for $\Delta$ reduces to the single 
term $(\Delta F)^2/\Delta g$, and it is not greater than the strict upper bound $\Delta h$ of 
$S_\delta$ for $\Delta$, i.e. $(\Delta F)^2/\Delta g \le \Delta h$, which amounts to (128). 

Let $G(\mathcal{E})$ be the set function on $[a, b]$, generated by the point function 
$g(x)$, and let $F(x)$ have the form 

$F(x) = \int_a^x f(x)\, G(d\mathcal{E}) + C$, (129) 

where $f(x) \in L_2$. We have 

$\Delta F = \int_\Delta f(x)\, G(d\mathcal{E}) = \int_\Delta f(x)\, dg(x)$, 

and we obtain by applying Buniakowski's inequality: 

$(\Delta F)^2 \le \int_\Delta f^2(x)\, G(d\mathcal{E}) \int_\Delta G(d\mathcal{E}) = \Delta g(x) \int_\Delta f^2(x)\, G(d\mathcal{E})$, 

i.e. condition (128) is fulfilled, and 

$h(x) = \int_a^x f^2(x)\, G(d\mathcal{E})$, (130) 

where, since $g(x)$ is continuous, it is unimportant whether $[a, x]$ is 
closed or not. 
Formula (129) leads to the completely additive set function 

$\varphi(\mathcal{E}) = \int_{\mathcal{E}} f(x)\, G(d\mathcal{E})$, 

defined for sets that are measurable with respect to $G(\mathcal{E})$ and belong 
to $[a, b]$, where $\Delta F = \varphi(\Delta)$. The sums 

$\sum_k \frac{\varphi^2(\mathcal{E}_k)}{G(\mathcal{E}_k)}$ (131) 

have a strict upper bound $i$, given by (118): 

$i = \int_a^b f^2(x)\, G(d\mathcal{E})$. (132) 


Let us show that sums (127), corresponding to a division of $[a, b]$ 
into sub-intervals only, have the same upper bound. This latter can 
never be greater than $i$. We can use the absolute continuity of $\varphi(\mathcal{E})$ 
to show that integral (132) is in fact the strict upper bound of sums 
(127), which are obtained on splitting $[a, b]$ into sub-intervals. 
By what has been said, given any $\varepsilon > 0$, there exists a subdivision $\delta$ 
of $[a, b]$ into measurable sets $\mathcal{E}_k$ $(k = 1, 2, \ldots, n)$, such that 

$\sum_{k=1}^{n} \frac{\varphi^2(\mathcal{E}_k)}{G(\mathcal{E}_k)} > i - \varepsilon$, (133) 

where the $i$ denotes integral (132). In view of the necessity of the 
condition in Theorem 1 of [37], we can say that, given any $\mathcal{E}_k$, there 
exists an elementary figure $R_k$, i.e. a finite sum of non-overlapping 
semi-open intervals, such that 

$R_k + e_k = \mathcal{E}_k + e'_k$, (134) 

where the measures of $e_k$ and $e'_k$ are as small as desired. The sets $\mathcal{E}_k$ 
have no points in common with each other, but the $R_k$ may have common 
points on account of the $e'_k$, i.e. $R_k R_l \subset e'_k e'_l$, and the measures of 
these common parts are also as small as desired. In each of equations 
(134) we can refer to $e_k$ the part of $R_k$ which is in common with the 
remaining $R_l$, this common part being the sum of a finite number 
of semi-open intervals. If $\eta$ is the greatest of the measures of $e_k$ and 
$e'_k$, we shall have for the new $e_k$, with this transfer: $G(e_k) < (n + 1)\eta$, 
since $G(R_k R_l) \le G(e'_k e'_l) \le \eta$. We can therefore assume that the $R_k$ 
in equations (134) have no points in common with each other, and 
the measures of $e_k$ and $e'_k$ are as small as desired. 

If we take into account (133) and the absolute continuity of $\varphi(\mathcal{E})$, 
and take the measures of $e_k$ and $e'_k$ as sufficiently small, we can write 

$\sum_{k=1}^{n} \frac{\varphi^2(R_k)}{G(R_k)} > i - 2\varepsilon$. 

Let $\Delta_s$ $(s = 1, 2, \ldots, m)$ be the sub-intervals appearing in the 
composition of all the $R_k$. We obtain by taking account of (62): 

$\sum_{s=1}^{m} \frac{\varphi^2(\Delta_s)}{G(\Delta_s)} > i - 2\varepsilon$. (135) 

Since $g(x)$ and $F(x)$ are continuous, we can regard the $\Delta_s$ as closed 
or open intervals. These intervals do not need to cover $[a, b]$. On 
adding the non-negative terms corresponding to the remaining intervals, 
we get (135) all the more for the complete sum, and it follows, 
since $\varepsilon$ is arbitrary, that integral (132) is the strict upper bound of 
sums (127), given condition (129), where $f(x) \in L_2$. A further point: 
since $g(x)$ is assumed continuous, the $h(x)$ defined by (130) is continuous. 

We now show that, if condition (128) is satisfied, i.e. sums (127) 
are bounded, $F(x)$ is expressible by (129), where $f(x) \in L_2$. Let $\Delta$ be 
a sub-interval of $[a, b]$ and $\Delta'_k$ the sub-intervals of some 
division of it. It follows from (128), by Buniakowski's inequality: 

$\left(\sum_k |\Delta'_k F|\right)^2 \le \left(\sum_k \sqrt{\Delta'_k g}\, \sqrt{\Delta'_k h}\right)^2 \le \sum_k \Delta'_k h \sum_k \Delta'_k g$, (136) 

i.e. 

$\left(\sum_k |\Delta'_k F|\right)^2 \le \Delta g \cdot \Delta h$. 

The same inequality holds for the strict upper bound of the sums on 
the left with any subdivisions, so that $F(x)$ is a function of bounded 
variation and we have for its total variation $v(x)$: 

$(\Delta v(x))^2 \le \Delta g\, \Delta h$. (137) 

If the $\Delta'_k$ are any desired non-overlapping intervals of $[a, b]$, it follows 
from (136) that 

$\left(\sum_k |\Delta'_k F|\right)^2 \le \sum_k \Delta'_k g\, [h(b) - h(a)]$. 

If the sum $\sum_k \Delta'_k g$ on the right is $< \varepsilon$, we have 

$\sum_k |\Delta'_k F| < \sqrt{\varepsilon}\, \sqrt{h(b) - h(a)}$, 

and it follows from this, since $\varepsilon$ is arbitrary, that $F(x)$ is absolutely 
continuous with respect to $g(x)$, i.e. 

$F(x) = \int_a^x f(x)\, dg(x) + C$. (138) 


It remains to show that $f(x) \in L_2$. We form the bounded function: 

$f_n(x) = n$, if $f(x) > n$; 
$f_n(x) = f(x)$, if $|f(x)| \le n$; 
$f_n(x) = -n$, if $f(x) < -n$, 

and put 

$F_n(x) = \int_a^x f_n(x)\, dg(x)$. (139) 

The functions $f_n(x) \in L_2$, and hence, by what has been proved: 

$\sup \sum_k \frac{(\Delta_k F_n)^2}{\Delta_k g} = \int_a^b f_n^2(x)\, dg(x)$. (140) 

If integral (138) is taken over different sets $\mathcal{E}$ of $[a, b]$, which are 
measurable with respect to $g(x)$, we get a set function whose total 
variation on $[a, x]$ is given by the integral [73]: 

$v(x) = \int_a^x |f(x)|\, dg(x)$. (141) 

If we split $[a, x]$ into intervals only, we get the same total variation 
for function (138) [74]. 
On taking (137) into account, we have 

$\sup \sum_k \frac{(\Delta_k v)^2}{\Delta_k g} = M$, (142) 

where $M$ is a finite number. On the other hand, by (139) and (141), 
we have $|\Delta_k F_n| \le \Delta_k v$, and, on taking (140) and (142) into account, 
we obtain for any $n$: 

$\int_a^b f_n^2(x)\, dg(x) \le M$, 

whence it follows that $f(x) \in L_2$. The above discussion yields the following 
theorem: 

THEOREM 2. Condition (128) is equivalent to the fact that $F(x)$ is 
expressible in the form (138), where $f(x) \in L_2$, and if (128) is satisfied, the 
strict upper bound of sums (127) is given by the integral (132). 
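
A small numerical sketch may make sums (127) concrete. It assumes the simplest hypothetical data, not taken from the text: $g(x) = x$ and $F(x) = x^2$ on $[0, 1]$, so that $f(x) = 2x$ and integral (132) equals $4/3$. The helper names `hellinger_sum` and `uniform` are illustrative only. The sums are computed on successively halved uniform subdivisions, so each subdivision is a continuation of the previous one.

```python
def hellinger_sum(F, g, points):
    """Sum (127): sum of (Delta_k F)^2 / Delta_k g over consecutive subdivision points."""
    total = 0.0
    for left, right in zip(points, points[1:]):
        dF = F(right) - F(left)
        dg = g(right) - g(left)
        if dg != 0.0:              # terms of the form 0/0 are taken as zero
            total += dF * dF / dg
    return total

def uniform(a, b, n):
    """Subdivision of [a, b] into n equal sub-intervals."""
    return [a + (b - a) * k / n for k in range(n + 1)]

F = lambda x: x * x                # F(x) = integral of 2t dt from 0 to x
g = lambda x: x                    # g(x) = x, so dg is ordinary length

sums = [hellinger_sum(F, g, uniform(0.0, 1.0, 2 ** m)) for m in range(1, 9)]
limit = 4.0 / 3.0                  # integral of f^2 dg = integral of 4x^2 over [0, 1]
```

For these data the sum on $n$ equal sub-intervals works out to $4/3 - 1/(3n^2)$, so the list `sums` increases with refinement and stays below `limit`, in line with Theorem 2.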

Throughout the above, we do not need to assume that $g(x)$ and $F(x)$ 
are continuous. Without this assumption, we take the basic interval 
as semi-open and split it into semi-open intervals for which 
$\Delta g = g(\beta + 0) - g(\alpha + 0)$. All our results are retained, apart from the 
continuity of $h(x)$. It may be mentioned that, as a consequence of 
condition (128) and the continuity of $g(x)$, we can write (128) with 
continuous $h(x)$. 


83. Properties of the Hellinger integral. The strict upper bound of 
sums (127) is the Hellinger integral, for which a similar notation to 
(117) is used: 

$\int_a^b \frac{[dF(x)]^2}{dg(x)}$, (143) 

and where we have the formula: 

$\int_a^b \frac{(dF)^2}{dg} = \int_a^b f^2(x)\, dg(x)$. (144) 

Let us show that this integral is simply the limit of sums (127) on 
indefinite refinement of the sub-intervals $\Delta_k$. 

THEOREM 3. If $F(x)$ satisfies condition (128) and $F_1(x)$ the analogous 
condition 

$(\Delta F_1)^2 \le \Delta g\, \Delta h_1$, (145) 

($g(x)$ is continuous), sum (127) and the sum 

$\sum_k \frac{\Delta_k F\, \Delta_k F_1}{\Delta_k g}$ (146) 

have a definite limit on indefinite refinement of the $\Delta_k$, the limit of (127) 
being equal to the strict upper bound of these sums, i.e. integral (143). 
As above, let $i$ be the strict upper bound of sums (127). Given 
$\varepsilon > 0$, there exists a fixed subdivision $\delta_0$ such that $S_{\delta_0} > i - \varepsilon$. Let 
$\delta$ be so fine a subdivision that every sub-interval of $\delta$ contains not 
more than one point of subdivision of $\delta_0$, and that the increment of 
the continuous function $h(x)$ in each sub-interval of $\delta$ is not greater 
than $\varepsilon$. We have for the subdivision $\delta\delta_0$: 

$S_{\delta\delta_0} \ge S_{\delta_0} > i - \varepsilon$. (147) 

If $p$ is the number of points of subdivision in $\delta_0$, not more than $p$ 
sub-intervals of $\delta$ are split into two on passing to $\delta\delta_0$, and the corresponding 
non-negative term of $S_\delta$ is replaced by two non-negative terms of 
$S_{\delta\delta_0}$. Each of these three terms is not greater than $\varepsilon$, by what has been 
said regarding the increment of $h(x)$ and property (128), so that 

$0 \le S_{\delta\delta_0} - S_\delta \le 2p\varepsilon$. 


On comparing with (147), we get $S_\delta > i - (2p + 1)\varepsilon$, whence it 
follows, since $\varepsilon$ is arbitrary, that the sum (127) tends to $i$ on indefinite 
refinement of the $\Delta_k$. As regards sums (146), we notice that $F(x) + F_1(x)$ 
is expressible by a formula of the type (138), where $f(x) + f_1(x) \in L_2$, 
and the sums 

$\sum_k \frac{[\Delta_k (F + F_1)]^2}{\Delta_k g}$, 

like the analogous sums for $F_1(x)$, have a limit on indefinite refinement 
of the $\Delta_k$. It follows from this that the sum 

$\sum_k \frac{\Delta_k F\, \Delta_k F_1}{\Delta_k g} = \frac{1}{2} \sum_k \frac{[\Delta_k (F + F_1)]^2}{\Delta_k g} - \frac{1}{2} \sum_k \frac{(\Delta_k F)^2}{\Delta_k g} - \frac{1}{2} \sum_k \frac{(\Delta_k F_1)^2}{\Delta_k g}$ 

also has a limit. It coincides with integral (122). We thus obtain the 
following Hellinger integrals: 

$\int_a^b \frac{(dF)^2}{dg} = \lim \sum_k \frac{(\Delta_k F)^2}{\Delta_k g}; \quad \int_a^b \frac{dF\, dF_1}{dg} = \lim \sum_k \frac{\Delta_k F\, \Delta_k F_1}{\Delta_k g}$. (148) 

By what was said in [81]: 

$\int_a^b \frac{dF\, dF_1}{dg} = \int_a^b f(x)\, f_1(x)\, dg(x)$, 

where 

$F_1(x) = \int_a^x f_1(x)\, dg(x)$. (149) 
We can consider the more general sums 

$\sum_k u(\xi_k)\, \frac{(\Delta_k F)^2}{\Delta_k g}$ (150) 

and 

$\sum_k u(\xi_k)\, \frac{\Delta_k F\, \Delta_k F_1}{\Delta_k g}$, (151) 

where $u(x)$ is continuous in $[a, b]$, integrals (148) exist and $\xi_k$ is any 
point of $\Delta_k$. These sums also have a definite limit on indefinite 
refinement of the $\Delta_k$. It is sufficient to prove this for sums (150). We take 
the sums 

$\sum_k m_k\, \frac{(\Delta_k F)^2}{\Delta_k g}$, (152) 

where $m_k$ is the least value of the continuous function $u(x)$ on the closed 
interval $\Delta_k$. It can be shown, precisely as above, that the sums do not 
decrease on adding new points of subdivision, that they are bounded 
and have a definite limit on indefinite refinement of the $\Delta_k$. By the 
uniform continuity of $u(x)$ and condition (128), the difference between 
sums (150) and (152) tends to zero on indefinite refinement of the $\Delta_k$, so 
that sums (150) also have a definite limit. We thus obtain the following 
Hellinger integrals: 

$\int_a^b u(x)\, \frac{(dF)^2}{dg} = \lim \sum_k u(\xi_k)\, \frac{(\Delta_k F)^2}{\Delta_k g}; \quad \int_a^b u(x)\, \frac{dF\, dF_1}{dg} = \lim \sum_k u(\xi_k)\, \frac{\Delta_k F\, \Delta_k F_1}{\Delta_k g}$. (153) 


The theory given above can be extended to the case when $g(x)$ is 
discontinuous. The sums indicated have a definite limit for a sequence 
of subdivisions $\delta_n$, regular in the sense of the general Stieltjes integral. 

All the above theory obviously still holds when $F(x)$, $F_1(x)$ and $u(x)$ 
are complex functions, where $F(x)$ and $F_1(x)$ must satisfy the 
conditions 

$|\Delta F|^2 \le \Delta g\, \Delta h; \quad |\Delta F_1|^2 \le \Delta g\, \Delta h_1$. 

The squares must be everywhere replaced by the square of the 
modulus, i.e. $(\Delta_k F)^2$ by $|\Delta_k F|^2$. 
Some further simple properties of the Hellinger integral may be 
noticed. Let $\Phi(x)$ satisfy condition (128) and be non-decreasing. We 
form the function 

$F(x) = \int_a^x u(x)\, d\Phi(x)$, (154) 

where $u(x)$ is continuous, and we consider the sums 

$\sum_k \frac{(\Delta_k F)^2}{\Delta_k g}$. (155) 

We apply the mean value theorem: 

$\Delta_k F = \int_{\Delta_k} u(x)\, d\Phi(x) = u(\xi_k)\, \Delta_k \Phi$, 

where $\xi_k \in \Delta_k$. Sum (155) can be rewritten as 

$\sum_k \frac{(\Delta_k F)^2}{\Delta_k g} = \sum_k u^2(\xi_k)\, \frac{(\Delta_k \Phi)^2}{\Delta_k g}$, 

and we get in the limit: 

$\int_a^b \frac{(dF)^2}{dg} = \int_a^b u^2(x)\, \frac{(d\Phi)^2}{dg}$. 

Similarly, if we have along with (154): 

$F_1(x) = \int_a^x u_1(x)\, d\Phi_1(x)$, 

where $\Phi_1(x)$ satisfies condition (128) and is non-decreasing, and if 
$u_1(x)$ is continuous, we obtain 

$\int_a^b \frac{dF\, dF_1}{dg} = \int_a^b u(x)\, u_1(x)\, \frac{d\Phi\, d\Phi_1}{dg}$. (156) 

If $\Phi(x)$ and $\Phi_1(x)$ satisfy condition (128), but are not monotonic, we 
also get (156), by using the canonical form for $\Phi(x)$ and $\Phi_1(x)$ as a 
difference between two non-decreasing functions. Notice that $F(x)$ and 
$F_1(x)$ obviously also satisfy condition (128). 


CHAPTER IV 


METRIC AND NORMED SPACES 


84. Metric space. The first part of this chapter will be devoted to
the theory of certain abstract spaces, and will be followed by the
application of this theory to various concrete spaces, primarily to
function spaces, i.e. sets of functions of a definite class. The same
abstract space may take several different concrete forms, so that it is
expedient to discuss abstract spaces.

Every abstract space is a non-empty set of elements, which is 
subject to certain axioms. The nature of the elements is not defined, 
and the theory of any abstract space is a consequence of the axioms 
which define the space. For the sake of a connected exposition, we 
shall first give the full theory of abstract spaces, then later describe 
the application of this theory to the various concrete forms of the 
spaces. Let us start with the theory of so-called metric spaces. 

A set $X$ of elements, which we shall denote by successive letters of
the Roman alphabet ($x$, $y$, $z$, etc.), is called a metric space if each pair
of its elements $x$, $y$ is associated with a non-negative number $\rho(x, y)$
(the distance between $x$ and $y$) such that the following conditions hold:

1. $\rho(x, y) \ge 0$, where the = sign holds when and only when $x = y$,
i.e. $x$ and $y$ are the same element.

2. $\rho(y, x) = \rho(x, y)$ (the axiom of symmetry);  (1)

3. $\rho(x, z) \le \rho(x, y) + \rho(y, z)$ (the triangle axiom).  (2)

These conditions must hold for any elements $x, y, z \in X$. If $y_1, y_2,
\ldots, y_m$ are any elements of $X$, we obtain by repeated application
of (2):

$$\rho(y_1, y_m) \le \rho(y_1, y_2) + \rho(y_2, y_3) + \cdots + \rho(y_{m-1}, y_m).$$  ($2_1$)
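
The three axioms can be spot-checked numerically on a finite sample of points. The sketch below is only an illustration: the helper name `is_metric`, the tolerance and the sample are assumptions made for this example, and passing the check on a sample does not prove that a function is a metric on the whole space.

```python
from itertools import product

def is_metric(points, rho, tol=1e-12):
    """Check the three metric axioms on a finite sample of points."""
    for x, y in product(points, repeat=2):
        d = rho(x, y)
        # Axiom 1: non-negative, and zero exactly when x == y.
        if d < 0 or (d <= tol) != (x == y):
            return False
        # Axiom 2: symmetry.
        if abs(d - rho(y, x)) > tol:
            return False
    # Axiom 3: the triangle axiom for every triple.
    for x, y, z in product(points, repeat=3):
        if rho(x, z) > rho(x, y) + rho(y, z) + tol:
            return False
    return True

sample = [-2.0, -0.5, 0.0, 1.0, 3.5]
absolute = lambda x, y: abs(x - y)      # the usual distance on the real line
squared = lambda x, y: (x - y) ** 2     # violates the triangle axiom

print(is_metric(sample, absolute))
print(is_metric(sample, squared))
```

The squared difference fails because, for instance, the triple $-2$, $0$, $3.5$ gives $30.25 > 4 + 12.25$.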


Let $x_n$ ($n = 1, 2, \ldots$) be an infinite sequence of elements and let
an element $x_0$ exist such that $\rho(x_0, x_n) \to 0$ as $n \to \infty$. We say that
$x_0$ is the limit of the sequence $x_n$, and write $x_n \Rightarrow x_0$ or $\lim x_n = x_0$.
It may easily be seen that a sequence cannot have more than one limit.
For, let $x_n \Rightarrow x_0$ and $x_n \Rightarrow y_0$. It has to be shown that $x_0 = y_0$. We
have by (2): $\rho(x_0, y_0) \le \rho(x_0, x_n) + \rho(x_n, y_0)$. On indefinite increase
of $n$ the right-hand side tends to zero, and we obtain in the limit
$\rho(x_0, y_0) \le 0$. But $\rho(x_0, y_0) \ge 0$, and it follows from these two
inequalities that $\rho(x_0, y_0) = 0$, i.e. $x_0 = y_0$. If $x_n \Rightarrow x_0$, then
obviously every infinite subsequence $x_{n_k} \Rightarrow x_0$.

Let us show that $\rho(x, y)$ is a continuous function of $x$ and $y$, i.e. if
$x_n \Rightarrow x_0$ and $y_n \Rightarrow y_0$, then $\rho(x_n, y_n) \to \rho(x_0, y_0)$.

We can write by ($2_1$):

$$\rho(x_n, y_n) \le \rho(x_n, x_0) + \rho(x_0, y_0) + \rho(y_0, y_n);$$

$$\rho(x_0, y_0) \le \rho(x_0, x_n) + \rho(x_n, y_n) + \rho(y_n, y_0);$$

whence

$$\rho(x_n, y_n) - \rho(x_0, y_0) \le \rho(x_n, x_0) + \rho(y_0, y_n);$$

$$\rho(x_0, y_0) - \rho(x_n, y_n) \le \rho(x_0, x_n) + \rho(y_n, y_0),$$

i.e.

$$|\rho(x_0, y_0) - \rho(x_n, y_n)| \le \rho(x_0, x_n) + \rho(y_0, y_n).$$

As $n \to \infty$, the right-hand side tends to zero, whence it follows that
$\rho(x_n, y_n) \to \rho(x_0, y_0)$.

If a sequence $x_n$ has a limit ($x_n \Rightarrow x_0$), given any $\varepsilon > 0$ there
exists an $N$ such that

$$\rho(x_m, x_n) \le \varepsilon \quad \text{for } m \text{ and } n > N.$$  (3)

This follows at once from $\rho(x_m, x_n) \le \rho(x_m, x_0) + \rho(x_0, x_n)$, the
right-hand side of which tends to zero as $m$ and $n \to \infty$. But it does
not follow from (3), on the basis of our axioms, that the sequence $x_n$
has a limit (Cauchy's test for the existence of a limit is not sufficient).
If we introduce the supplementary requirement that the existence of
a limit of the sequence $x_n$ follows from (3), the metric space in question
is called a complete metric space.

Let $U$ be a set of elements of the metric space. It is said to be
bounded if an element $x_0$ and a positive number $A$ exist such that
$\rho(x_0, x) \le A$ for all $x$ of $U$. Let $x_1$ be any fixed element different from
$x_0$. We have $\rho(x_1, x) \le \rho(x_1, x_0) + \rho(x_0, x)$, and we obtain for elements
of $U$: $\rho(x_1, x) \le \rho(x_1, x_0) + A$, where the right-hand side is a positive
number independent of $x$. Thus the choice of element $x_0$ is of no importance
in defining a bounded set $U$. It is easily shown that, if a sequence $x_n$ has a
limit, the set of elements $x_n$ is bounded.
We describe $x_0$ as the limiting element of a set $U$ of elements of $X$
if there exists a sequence $x_n$ of elements of $U$ such that $x_n \Rightarrow x_0$.
A set $U$ containing all its limiting elements is said to be closed. If $U$ is
not closed, and we associate with it all its limiting elements, the new
set, which we write as $\bar{U}$, is closed [31]. The passage from $U$ to $\bar{U}$ is
described as the closure of $U$. If $U$ is closed, then $\bar{U} = U$. If $U$ is the
empty set (contains no elements), $\bar{U}$ must also be reckoned empty.
The set of elements satisfying the condition $\rho(x_0, x) < R$, where $x_0$ is
a fixed element and $R$ is a positive number, is called an open sphere
with centre $x_0$ and radius $R$, whilst the sphere is closed in the case of
$\rho(x_0, x) \le R$.

It is easily shown, from the fact that $\rho(x_0, x)$ is a continuous function
of $x$, that a closed sphere is a closed set in the sense of the above
definition.

Notice that any non-empty set $U$ of $X$ is also a metric space, if the
same definition of $\rho(x, y)$ as for the whole of $X$ is retained for the
elements of $U$. The space $X$ may in fact consist of a finite number of
elements.

Let $X$ and $X'$ be two metric spaces, and suppose a one-to-one
correspondence can be established between their elements such that
$\rho(x, y) = \rho(x', y')$, where $x$ and $x'$, $y$ and $y'$ are any corresponding
elements of $X$ and $X'$. In this case $X$ and $X'$ are called isometric.
It is meaningless to differentiate between isometric spaces, from the
point of view of abstract theory.


85. The completion of a metric space. A sequence $x_n$ of elements
of $X$ is described as fundamental (or mutually convergent), if condition
(3) is fulfilled for it. If $X$ is not a complete space, not every fundamental
sequence has a limit. Let us show that, in this case, new elements
(sometimes termed "ideal elements") can be associated with the space,
with a corresponding extension of the concept of distance, in such a
way that the space obtained is complete.

We shall start by proving a lemma. 

LEMMA. If $x_n$ and $y_n$ are two fundamental sequences, the numerical
sequence $\rho(x_n, y_n)$ has a limit.

By ($2_1$), $\rho(x_n, y_n) \le \rho(x_n, x_m) + \rho(x_m, y_m) + \rho(y_m, y_n)$,
whence $\rho(x_n, y_n) - \rho(x_m, y_m) \le \rho(x_n, x_m) + \rho(y_m, y_n)$.
On interchanging the subscripts $m$ and $n$ and using (1), we get
$\rho(x_m, y_m) - \rho(x_n, y_n) \le \rho(x_n, x_m) + \rho(y_m, y_n)$. It follows from
the last two inequalities that

$$|\rho(x_n, y_n) - \rho(x_m, y_m)| \le \rho(x_n, x_m) + \rho(y_m, y_n).$$

On indefinite increase of $m$ and $n$, the right-hand side tends to zero,
so that Cauchy's test for the existence of a limit is fulfilled for the
numerical sequence $\rho(x_n, y_n)$, which is what we wanted to prove.

Let us classify all the fundamental sequences, viz. the fundamental
sequences $x_n$ and $x_n'$ for which $\rho(x_n, x_n') \to 0$ are all put into one class.
If $x_n$ and $x_n'$, as also $x_n'$ and $x_n''$, belong to one class, then $x_n$ and $x_n''$
belong to one class, since $\rho(x_n, x_n') \to 0$ and $\rho(x_n', x_n'') \to 0$ imply, by (2),
that $\rho(x_n, x_n'') \to 0$. The limit of $\rho(x_n, y_n)$ for sequences $x_n$ and $y_n$
belonging to different classes will be non-zero (positive). It also follows
from (2) that, if the sequence $x_n$ has a limit $x_0$ in $X$, any other sequence
$x_n'$ of the same class is also convergent and has the same limit $x_0$.
Since the distance is continuous, sequences belonging to different
classes cannot have the same limit. These classes of fundamental
sequences therefore fall into two types. We shall start by describing
classes of the first type. Let $x_0$ be an element of $X$. We shall have a
class of sequences having a limit equal to $x_0$. This class includes for
instance the sequence $x_n$ in which all the elements are equal to $x_0$.
Each $x_0$ will have its own class of sequences. Classes of the second
type consist of sequences $x_n$ having no limit in $X$. If $X$ is complete,
there are no classes of the second type.

Suppose that $X$ is not a complete space. We now form a new metric space
$\bar{X}$, taking as its elements the above-mentioned classes of fundamental
sequences of $X$. We still have to introduce into $\bar{X}$ the concept of distance
and show that its three basic properties hold. Let $\bar{x}$ and $\bar{y}$ be two
elements of $\bar{X}$. We take any two sequences $x_n$ and $y_n$ from the classes of
sequences corresponding to them and define $\rho(\bar{x}, \bar{y})$ by the formula

$$\rho(\bar{x}, \bar{y}) = \lim_{n \to \infty} \rho(x_n, y_n).$$  (4)


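We show that the non-negative number $\rho(\bar{x}, \bar{y})$ is independent of
the choice of $x_n$ and $y_n$ from the classes corresponding to $\bar{x}$ and $\bar{y}$. Let $x_n'$
and $y_n'$ be any two sequences of these classes and

$$\rho'(\bar{x}, \bar{y}) = \lim_{n \to \infty} \rho(x_n', y_n').$$

We have to show that $\rho'(\bar{x}, \bar{y}) = \rho(\bar{x}, \bar{y})$. By ($2_1$):

$$\rho(x_n, y_n) \le \rho(x_n, x_n') + \rho(x_n', y_n') + \rho(y_n', y_n).$$

On observing that $\rho(x_n, x_n') \to 0$, $\rho(y_n', y_n) \to 0$ and passing to the
limit in the inequality written, we get $\rho(\bar{x}, \bar{y}) \le \rho'(\bar{x}, \bar{y})$. Similarly,
$\rho'(\bar{x}, \bar{y}) \le \rho(\bar{x}, \bar{y})$, so that $\rho'(\bar{x}, \bar{y}) = \rho(\bar{x}, \bar{y})$. Formula (4) thus defines
$\rho(\bar{x}, \bar{y})$ uniquely. Let us now verify the three basic properties. Obviously,
$\rho(\bar{x}, \bar{y}) \ge 0$.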
(1) Let $\rho(\bar{x}, \bar{y}) = 0$, i.e. $\rho(x_n, y_n) \to 0$. It follows from this that the
sequences $x_n$ and $y_n$ are of the same class, i.e. $\bar{x} = \bar{y}$.

(2) The property $\rho(\bar{x}, \bar{y}) = \rho(\bar{y}, \bar{x})$ follows at once from
$\rho(x_n, y_n) = \rho(y_n, x_n)$.

(3) We choose sequences $x_n$, $y_n$ and $z_n$ from the classes of sequences
corresponding to $\bar{x}$, $\bar{y}$ and $\bar{z}$. We have

$$\rho(\bar{x}, \bar{z}) = \lim_{n \to \infty} \rho(x_n, z_n) \le \lim_{n \to \infty} [\rho(x_n, y_n) + \rho(y_n, z_n)] = \rho(\bar{x}, \bar{y}) + \rho(\bar{y}, \bar{z}).$$


Let the class corresponding to $\bar{x}$ be of the first type, and let $x_0$ be
the limit of sequences of this class. We can identify the element $\bar{x}$ of
$\bar{X}$ with the indicated element $x_0$ of $X$. The elements $\bar{x}$ such that the
corresponding classes are of the second type are those elements of $\bar{X}$
which have not appeared in $X$. If $\bar{x}$ and $\bar{y}$ are elements corresponding
to classes of the first type, whilst $x_0$ and $y_0$ are the elements of $X$
identified with them in accordance with the above, we can put $x_n = x_0$
and $y_n = y_0$ in (4) for any $n$, and obtain

$$\rho(\bar{x}, \bar{y}) = \lim_{n \to \infty} \rho(x_0, y_0) = \rho(x_0, y_0),$$

i.e. the new distance is the same as the old one for elements belonging
to $X$. If $\bar{x}$ corresponds to a class of the first type ($x_0$ is the corresponding
element of $X$), and $\bar{y}$ to a class of the second type, (4) gives

$$\rho(\bar{x}, \bar{y}) = \lim_{n \to \infty} \rho(x_0, y_n).$$


Let us further show that, if the sequence $x_n$ belongs to the class
defining the element $\bar{x}$, then $\rho(\bar{x}, x_n) \to 0$ as $n \to \infty$.

We have by definition $\rho(\bar{x}, x_n) = \lim_{m \to \infty} \rho(x_m, x_n)$. But, since $x_n$ is
a fundamental sequence, (3) holds and $\rho(\bar{x}, x_n) \le \varepsilon$ for $n > N$, i.e.
$\rho(\bar{x}, x_n) \to 0$ as $n \to \infty$.

We now show that $X$ is dense in $\bar{X}$, i.e. if $\bar{x}$ is any element of $\bar{X}$ and
$\varepsilon$ is any given positive number, there exists an element $x$ of $X$ such
that $\rho(\bar{x}, x) \le \varepsilon$. If the class corresponding to $\bar{x}$ is of the first type and
$\bar{x}$ is identified with the element $x_0$ of $X$, given any $\varepsilon > 0$ we can put
$x = x_0$, since $\rho(\bar{x}, x_0) = \rho(x_0, x_0) = 0$. Let $\bar{x}$ not belong to $X$, and let
a fundamental sequence $x_n$ correspond to it. We fix an $m$ such that
$\rho(x_n, x_m) \le \varepsilon$ for $n > m$, and show that we can put $x = x_m$. In fact,
$\rho(\bar{x}, x_m) = \lim_{n \to \infty} \rho(x_n, x_m)$, and, since $\rho(x_n, x_m) \le \varepsilon$ for $n > m$, we get

$$\rho(\bar{x}, x_m) \le \varepsilon.$$
We now show that $\bar{X}$ is a complete space. Let $\bar{x}_n$ be a fundamental
sequence in $\bar{X}$, i.e. $\rho(\bar{x}_n, \bar{x}_m) \le \varepsilon$ for $n$ and $m > N$. We have to show
that there exists in $\bar{X}$ an element $\bar{x}$ such that $\rho(\bar{x}, \bar{x}_n) \to 0$ as $n \to \infty$.
By what has been proved, given any $n$, there exists an element $x_n$ of
$X$ such that $\rho(\bar{x}_n, x_n) \le 1/n$. It is easily seen that the sequence $x_n$
of elements of $X$ is fundamental:

$$\rho(x_n, x_m) \le \rho(x_n, \bar{x}_n) + \rho(\bar{x}_n, \bar{x}_m) + \rho(\bar{x}_m, x_m) \le \frac{1}{n} + \frac{1}{m} + \rho(\bar{x}_n, \bar{x}_m).$$

The sequence $x_n$ appears in a class defining some element $\bar{x}$ of $\bar{X}$.
Let us show that $\rho(\bar{x}, \bar{x}_n) \to 0$. This follows from the inequality

$$\rho(\bar{x}, \bar{x}_n) \le \rho(\bar{x}, x_n) + \rho(x_n, \bar{x}_n) \le \rho(\bar{x}, x_n) + \frac{1}{n}$$

and the fact that $\rho(\bar{x}, x_n) \to 0$, as we have seen above. The completeness
of $\bar{X}$ is proved.

We must now prove a theorem that the completion of a metric
space $X$ is unique.

THEOREM. The completion of a space $X$, such that $X$ is dense in the
new space, is unique up to isometric spaces.

Let $Y$ be a complete metric space containing $X$, in which $X$ is dense.
We have to show that $Y$ is isometric with $\bar{X}$. It is naturally assumed
here that the distance between two elements of $Y$ belonging to $X$ is the
same as in $X$. Let $y$ be an element of $Y$. Since $X$ is dense in $Y$, there
exists a sequence $x_n$ of elements of $X$ such that $\rho(y, x_n) \to 0$ in $Y$, so
that $x_n$ is a fundamental sequence in $Y$ and in $X$. A definite element $\bar{x}$
of $\bar{X}$ corresponds to this sequence. It is easily seen that $\bar{x}$ is independent
of the choice of $x_n$, the only important point being that $\rho(y, x_n) \to 0$.
We put $\bar{x}$ in correspondence with the above-mentioned element $y$ of $Y$.
Now suppose that we have a definite element $\bar{x}'$ of $\bar{X}$. We take some
sequence of elements $x_n'$ of $X$ defining it. This sequence is fundamental
in the complete space $Y$ and hence defines an element $y'$ of $Y$. It may
easily be seen that $y'$ does not depend on the choice of the sequence
$x_n'$, the only important point being that it defines $\bar{x}'$. We set $y'$ in
correspondence with $\bar{x}'$. As is easily seen, we establish by this means a
one-to-one correspondence between elements of $Y$ and $\bar{X}$. It remains
to show that $\rho(\bar{x}, \bar{x}') = \rho(y, y')$.

This follows from the definition of $\rho(\bar{x}, \bar{x}')$ in $\bar{X}$ and the continuity of
distance in $Y$:

$$\rho(\bar{x}, \bar{x}') = \lim_{n \to \infty} \rho(x_n, x_n') = \rho(y, y').$$
We have dwelt in detail on the completion of a metric space because
this process plays an important role in applications of the theory
of metric spaces, and enables us to confine the discussion to complete
spaces. Let us give three simple examples.

(1) Let $X$ be the space of all real rational numbers $x, y, z, \ldots$,
distance being defined by the formula $\rho(x, y) = |x - y|$. Clearly,
$\rho(x, y)$ satisfies all three of the conditions in the definition of metric space.
Let us take a fundamental sequence of rational numbers $x_n$. By Cauchy's test,
it must have a limit among the real numbers, though if this limit is an
irrational number, the sequence has no limit in $X$, i.e. $X$ is not a complete
space. Completion of it implies bringing in all the irrational numbers, and
the space $\bar{X}$ of all real numbers is complete.
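
Example (1) can be made concrete with exact rational arithmetic. The sketch below is an illustration chosen for this purpose, not taken from the text: it uses Newton's iteration for $x^2 = 2$, whose terms are all rational, whose successive gaps shrink rapidly, and whose limit $\sqrt{2}$ lies outside $X$.

```python
from fractions import Fraction

def newton_sqrt2(steps):
    """Rational Newton iterates x_{n+1} = (x_n + 2/x_n)/2, starting from 1."""
    x = Fraction(1)
    seq = [x]
    for _ in range(steps):
        x = (x + 2 / x) / 2
        seq.append(x)
    return seq

seq = newton_sqrt2(6)
# Distances rho(x_n, x_{n+1}) = |x_n - x_{n+1}| between consecutive terms.
gaps = [abs(seq[n + 1] - seq[n]) for n in range(len(seq) - 1)]
for n, g in enumerate(gaps):
    print(n, float(g))
# Every term is rational, and no rational number squares to 2.
print(all(x * x != 2 for x in seq))
```

Every term stays in $X$, yet the only candidate for the limit is the "ideal element" supplied by the completion.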

(2) Let us take the space $C$ of all real functions $x(t), y(t), z(t), \ldots$,
continuous in a finite interval $[a, b]$, and let us define the distance
$\rho(x, y)$ by [14]:

$$\rho(x, y) = \max_{a \le t \le b} |x(t) - y(t)|.$$

This definition of $\rho(x, y)$ is easily shown to be permissible. The
convergence $\rho(x, x_n) \to 0$ is here the uniform convergence $x_n(t) \to x(t)$ in
the interval $a \le t \le b$, and if $|x_n(t) - x_m(t)| \to 0$ uniformly in $t$ as $n$
and $m \to \infty$, there exists a continuous function $x(t)$ such that $x_n(t) \to x(t)$
uniformly [I; 144], i.e. $C$ is a complete space.

(3) We now take the space $F$ of the same functions continuous in
$[a, b]$, but with a different definition of distance:

$$\rho(x, y) = \Big[\int_a^b |x(t) - y(t)|^2\, dt\Big]^{1/2}.$$  (5)

This is in fact permissible. We take the fundamental sequence
$\{x_n(t)\}$ of $F$:

$$\int_a^b |x_n(t) - x_m(t)|^2\, dt \to 0 \quad \text{as } n \text{ and } m \to \infty.$$

It has a limit in the sense of metric (5) [56], but the limit function
may be any function of $L_2$, since continuous functions are everywhere
dense in $L_2$ [60]. If the limit function is not equivalent to a continuous
function, the sequence, fundamental in $F$, has no limit in $F$, i.e. the
space $F$ is not complete. Completion of it gives functions of $L_2$, not
equivalent to continuous functions, and transforms $F$ into $L_2$.
Instead of functions of one variable, we could have considered the
set of functions $x(t_1, t_2, \ldots, t_n)$ of $n$ variables, continuous in a bounded
closed set of n-dimensional space.
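
The difference between the metrics of $C$ and $F$ can be seen numerically. In the sketch below (an illustration under assumptions: the interval $[-1, 1]$, the ramp functions $x_n(t) = \max(-1, \min(1, nt))$ and the grid quadrature are all chosen here), the distance between $x_n$ and $x_{2n}$ stays near $1/2$ in the metric of $C$ but tends to zero in metric (5). The pointwise limit is the discontinuous sign function, so the sequence is fundamental in $F$ without having a limit in $F$.

```python
import math

def ramp(n):
    """Continuous function x_n(t) = max(-1, min(1, n*t)) on [-1, 1]."""
    return lambda t: max(-1.0, min(1.0, n * t))

def dist_sup(x, y, a=-1.0, b=1.0, steps=4000):
    # Approximates max |x(t) - y(t)| over a uniform grid on [a, b].
    return max(abs(x(a + (b - a) * i / steps) - y(a + (b - a) * i / steps))
               for i in range(steps + 1))

def dist_l2(x, y, a=-1.0, b=1.0, steps=4000):
    # Midpoint rule for the integral of |x - y|^2, followed by a square root.
    h = (b - a) / steps
    total = sum((x(a + (i + 0.5) * h) - y(a + (i + 0.5) * h)) ** 2
                for i in range(steps))
    return math.sqrt(total * h)

for n in (5, 10, 20, 40):
    x, y = ramp(n), ramp(2 * n)
    print(n, round(dist_sup(x, y), 3), round(dist_l2(x, y), 3))
```

For these ramps a direct calculation gives $\rho(x_n, x_{2n}) = (6n)^{-1/2}$ in metric (5), which the quadrature reproduces to within its discretization error.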
It must be pointed out again that, when completing an actual
metric space, it is important to be able to interpret the actual meaning
of the new elements obtained from the completion. In the last example
these were functions of $L_2$ not equivalent to continuous functions.
Another point: as we have seen above, the space $L_2$ can be considered
on any measurable set. We have discussed the case of a bounded closed
set because we started from the space $F$ of continuous functions.

Our next theorem holds in complete metric spaces. In future we
shall write $S(x, r)$ for an open sphere in $X$ with centre $x$ and radius $r$
and $\bar{S}(x, r)$ for the closed sphere.

THEOREM. Suppose we have a sequence of closed spheres $\bar{S}(x_n, r_n)$
($n = 1, 2, \ldots$) in the complete metric space $X$, such that each successive
sphere belongs to the previous one and the radii $r_n \to 0$ as $n \to \infty$. In this
case there is a unique point belonging to all the $\bar{S}(x_n, r_n)$.

By hypothesis, $\bar{S}(x_{n+p}, r_{n+p}) \subset \bar{S}(x_n, r_n)$ ($p > 0$), so that
$\rho(x_{n+p}, x_n) \le r_n$ for any $p > 0$, i.e. the sequence $x_n$ is fundamental,
and has a limit, say $x_0$, since $X$ is complete. We take a fixed sphere
$\bar{S}(x_n, r_n)$ and show that $x_0 \in \bar{S}(x_n, r_n)$.

In fact, all the elements of the sequence $x_n, x_{n+1}, \ldots$, having $x_0$ as
limit, belong to $\bar{S}(x_n, r_n)$ by hypothesis; since $\bar{S}(x_n, r_n)$ is a closed set,
$x_0 \in \bar{S}(x_n, r_n)$.

Suppose now that there exists an element $x_0'$ belonging to all the
$\bar{S}(x_n, r_n)$. Let us show that $x_0' = x_0$. Since $x_0$ and $x_0'$ belong to all the
$\bar{S}(x_n, r_n)$, we have $\rho(x_0, x_0') \le \rho(x_0, x_n) + \rho(x_n, x_0') \le 2r_n$. This
inequality gives in the limit $\rho(x_0, x_0') \le 0$, i.e. $\rho(x_0, x_0') = 0$, whence it
follows that $x_0'$ coincides with $x_0$. The theorem is proved. A further point:
every closed set $U$ of a complete metric space $X$ is also a complete metric
space (with the assumption, obviously, that the distance $\rho(x, y)$ in $U$
is equal to the distance between $x$ and $y$ in $X$). All this follows from
the fact that every sequence $x_n$, fundamental in $U$, has a limit in $X$,
and this limit must belong to $U$, since $U$ is a closed set.
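
On the real line with $\rho(x, y) = |x - y|$ a closed sphere with centre $c$ and radius $r$ is the closed interval $[c - r, c + r]$, so the familiar bisection method produces exactly such a nested sequence of closed spheres with radii tending to zero. The sketch below is an illustration only; the function $t^3 - 2$ and the helper name are assumptions for the example.

```python
def nested_intervals(f, a, b, steps):
    """Bisection; each interval [a, b] is recorded as a closed sphere (centre, radius)."""
    spheres = []
    for _ in range(steps):
        c = (a + b) / 2
        spheres.append((c, (b - a) / 2))
        # Keep the half on which f changes sign, so the root stays inside.
        if f(a) * f(c) <= 0:
            b = c
        else:
            a = c
    return spheres

spheres = nested_intervals(lambda t: t ** 3 - 2, 1.0, 2.0, 40)
centre, radius = spheres[-1]
print(centre, radius)
```

The unique common point of all the intervals is the root $2^{1/3}$, and each centre differs from it by at most the corresponding radius.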


86. Operators and functionals. The principle of compressed mappings.
Let $X$ and $X'$ be two given metric spaces. The correspondence $x' = Ax$,
relating definite elements $x'$ of $X'$ to elements $x$ of $X$, is called an
operator acting from $X$ into $X'$. An operator may not be defined in
the whole of $X$. The set of elements $x$ of $X$ on which the operator $A$ is
defined is called the domain of definition of $A$ and will be denoted here
by $D(A)$. We shall denote the set of values of $Ax$ by $R(A)$. This is a
set of elements of $X'$. If $R(A)$ is the whole of $X'$, the equation

$$x' = Ax$$  (6)

has at least one solution for any $x'$ of $X'$. Suppose that $A$ establishes
a one-to-one correspondence between $D(A)$ and $R(A)$, i.e. given different $x$
of $D(A)$, different $x'$ of $R(A)$ are obtained by (6). In this case equation
(6) has a unique solution in $D(A)$ for every $x'$ of $R(A)$.

Functionals represent a particular, but extremely important, case
of operators. The latter are called functionals when $X'$ is the real
number space, with the definition of distance of [85]:
$\rho(x', y') = |x' - y'|$. The space of all complex numbers is also occasionally
taken, with the same definition of distance.

The following test shows if the equation $x - Ax = 0$ is uniquely
soluble, in the case when $X'$ coincides with $X$.

THEOREM. (The principle of compressed mappings.) If the operator $A$
maps the complete metric space $X$ into itself, $D(A) = X$, and for any $x$
and $y$ of $X$:

$$\rho(Ax, Ay) \le \alpha\, \rho(x, y),$$  (7)

where $\alpha$ is a number satisfying the condition $0 < \alpha < 1$, the equation
$x = Ax$ has one and only one solution. This solution can be obtained as
the limit of the sequence

$$x_2 = Ax_1, \quad x_3 = Ax_2, \quad x_4 = Ax_3, \ldots,$$  (8)

formed with an arbitrary choice of the initial element $x_1$.
In the present case, $D(A) = X$ and $R(A) \subset X$. We have

$$\rho(x_n, x_{n+1}) = \rho(Ax_{n-1}, Ax_n) \le \alpha\, \rho(x_{n-1}, x_n).$$

On applying the same inequality to $\rho(x_{n-1}, x_n)$ and so on, we get
$\rho(x_n, x_{n+1}) \le \alpha^{n-1} \rho(x_1, x_2)$ ($n = 1, 2, 3, \ldots$), whence it follows that,
with $m > n$:

$$\rho(x_n, x_m) \le \rho(x_n, x_{n+1}) + \rho(x_{n+1}, x_{n+2}) + \cdots + \rho(x_{m-1}, x_m) \le$$

$$\le \alpha^{n-1} (1 + \alpha + \cdots + \alpha^{m-n-1})\, \rho(x_1, x_2) \le \frac{\alpha^{n-1}}{1 - \alpha}\, \rho(x_1, x_2).$$

On observing that $\rho(x_n, x_n) = 0$ and $\rho(x_n, x_m) = \rho(x_m, x_n)$, we see
that $\rho(x_n, x_m) \to 0$ as $n$ and $m \to \infty$. Since $X$ is complete, the sequence
$x_n$ has a limit, which we write as $x_0$ ($x_n \Rightarrow x_0$). Let us show that
$Ax_n \Rightarrow Ax_0$:

$$\rho(Ax_{n-1}, Ax_0) \le \alpha\, \rho(x_{n-1}, x_0) \to 0,$$

since $\rho(x_{n-1}, x_0) \to 0$. On passing to the limit in the equation
$x_n = Ax_{n-1}$, we get $x_0 = Ax_0$. It remains to show that the solution of
$x = Ax$ is unique. Let $x'$ be a solution of this equation: $x' = Ax'$.
We have to show that $x' = x_0$. We have $\rho(x_0, x') = \rho(Ax_0, Ax') \le
\alpha\, \rho(x_0, x')$, i.e. $(1 - \alpha)\, \rho(x_0, x') \le 0$, whence $\rho(x_0, x') = 0$, so that
$x'$ coincides with $x_0$. The theorem is proved.

Note. Let $U$ be a closed set of $X$. If $D(A) = U$, $R(A) \subset U$ and
condition (7) is satisfied ($0 < \alpha < 1$), the theorem holds, where
$x_0 \in U$ and every $x'$ satisfying $x' = Ax'$ coincides with $x_0$. We are
assuming that $x' \in U$, since $A$ is defined in $U$.
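
The method of successive approximations in the theorem translates directly into a short routine. The sketch below is a minimal illustration under assumptions: the function name, the stopping rule and the example $A(x) = \cos x$ on the closed set $U = [0, 1]$ are chosen here. The stopping rule uses the bound $\rho(x_{n+1}, x_0) \le \frac{\alpha}{1 - \alpha}\, \rho(x_n, x_{n+1})$, which follows from the same geometric-series argument as in the proof but is not stated in the text.

```python
import math

def fixed_point(A, x1, rho, alpha, eps=1e-12, max_iter=1000):
    """Successive approximations x_{n+1} = A(x_n) for a contraction A.

    Stops when alpha/(1-alpha) * rho(x_n, x_{n+1}), an upper bound on the
    distance from x_{n+1} to the fixed point, does not exceed eps.
    """
    x = x1
    for n in range(1, max_iter + 1):
        x_next = A(x)
        if alpha / (1 - alpha) * rho(x, x_next) <= eps:
            return x_next, n
        x = x_next
    raise RuntimeError("no convergence within max_iter")

# Assumed example: cos maps [0, 1] into itself, and |sin x| <= sin(1) < 1
# there, so condition (7) holds on U = [0, 1] with alpha = sin(1).
root, steps = fixed_point(math.cos, 0.5, lambda x, y: abs(x - y), math.sin(1.0))
print(root, steps)
```

Any other starting point in $[0, 1]$ leads to the same limit, in agreement with the uniqueness part of the theorem.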


87. Examples. Before turning to examples of the application of the principle
of compressed mappings, some examples of complete metric spaces will be given.

1. The space $R_n$ of all sequences of $n$ real numbers. The distance
between elements $x(a_1, a_2, \ldots, a_n)$ and $y(b_1, b_2, \ldots, b_n)$ of $R_n$ is defined as
follows:

$$\rho(x, y) = \Big[\sum_{k=1}^{n} (a_k - b_k)^2\Big]^{1/2}.$$  (9)

We can also define the complex space $R_n$ of sequences of $n$ complex numbers.
In formula (9) for the distance we have to replace $(a_k - b_k)^2$ by $|a_k - b_k|^2$.
This replacement also applies for later examples of spaces of sequences or
functions.

2. The space $m$ of bounded infinite sequences of numbers
$x(a_1, a_2, \ldots)$, i.e. given any element $x$ of $m$, there exists a positive number
$m_x$ such that $|a_i| \le m_x$ for any $i$.

The formula for $\rho(x, y)$ is

$$\rho(x, y) = \sup_k |a_k - b_k|.$$  (10)

Convergence in $m$ is equivalent to a coordinate convergence, uniform with
respect to the number of the component.
3. The space $s$ of all infinite numerical sequences, where

$$\rho(x, y) = \sum_{k=1}^{\infty} \frac{1}{2^k}\, \frac{|a_k - b_k|}{1 + |a_k - b_k|}.$$  (11)

The proof that this definition of $\rho(x, y)$ is permissible follows the same
lines as the proof below for the analogous function space $S$.
Convergence in space $s$, as in $R_n$, is equivalent to a coordinate convergence.

4. The space $l_p$ ($p \ge 1$) of infinite sequences of complex numbers $a_k$ such
that

$$\sum_{k=1}^{\infty} |a_k|^p < +\infty,$$  (12)

where

$$\rho(x, y) = \Big[\sum_{k=1}^{\infty} |a_k - b_k|^p\Big]^{1/p}.$$  (13)

The triangle rule is obtained from Minkowski's inequality for sums with
$p > 1$ [62] and is obvious with $p = 1$.
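
The distances of examples 2 to 4 are easy to evaluate for sequences with finitely many non-zero terms. The sketch below is an illustration under that assumption: sequences are represented by Python lists of equal length (the omitted tails being zero), and the function names are chosen here.

```python
def rho_m(x, y):
    """Metric (10) of the space m: supremum of |a_k - b_k|."""
    return max(abs(a - b) for a, b in zip(x, y))

def rho_s(x, y):
    """Metric of the space s: sum of 2^-k * |a_k - b_k| / (1 + |a_k - b_k|)."""
    return sum(abs(a - b) / (2 ** k * (1 + abs(a - b)))
               for k, (a, b) in enumerate(zip(x, y), start=1))

def rho_lp(x, y, p):
    """Metric (13) of the space l_p, p >= 1."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1 / p)

x = [1.0, 0.0, -2.0, 0.5]
y = [0.0, 0.0, 1.0, 0.5]
print(rho_m(x, y), rho_s(x, y), rho_lp(x, y, 2))
```

Here the coordinate differences are $1, 0, 3, 0$, so the three distances are $3$, $1/4 + 3/32$ and $\sqrt{10}$. The metric of $s$ never exceeds $1$, however large the individual differences.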
5. The space $C$ of functions $\varphi(x)$, where $x$ is a point of n-dimensional space
$R_n$, that are continuous on a bounded closed set $\mathcal{E}$, where

$$\rho(\varphi, \psi) = \max_{x \in \mathcal{E}} |\varphi(x) - \psi(x)|.$$  (14)

6. The space $M$ of functions $\varphi(x)$, defined on a (Lebesgue) measurable set
$\mathcal{E}$ of $R_n$. Equivalent functions are identified, and every function of $M$ is bounded
(or equivalent to a bounded function).

The definition of distance is

$$\rho(\varphi, \psi) = \inf_{m(\mathcal{E}_0) = 0}\ \sup_{x \in \mathcal{E} - \mathcal{E}_0} |\varphi(x) - \psi(x)|.$$  (15)

The meaning of this definition is as follows: we exclude from $\mathcal{E}$ a set $\mathcal{E}_0$ of
measure zero, define the strict upper bound of $|\varphi(x) - \psi(x)|$ on the remaining
set $\mathcal{E} - \mathcal{E}_0$, choose the set $\mathcal{E}_0$ in all possible ways and find the strict lower bound
of the resulting set of non-negative strict upper bounds $\sup |\varphi(x) - \psi(x)|$.
Occasionally, (15) is written as

$$\rho(\varphi, \psi) = \operatorname*{vrai\,max}_{x \in \mathcal{E}} |\varphi(x) - \psi(x)|.$$  (16)

If $\mathcal{E}$ is a bounded closed set, $C$ is part of $M$ and $\rho(\varphi, \psi)$ is the same for $C$
as for $M$, i.e. $C$ is isometric to part of $M$.

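7. The space $S$ of all functions $\varphi(x)$ measurable on a measurable point set
$\mathcal{E}$ of finite measure of $R_n$, where

$$\rho(\varphi, \psi) = \int_{\mathcal{E}} \frac{|\varphi(x) - \psi(x)|}{1 + |\varphi(x) - \psi(x)|}\, dx.$$  (17)

Here and in future, we understand the Lebesgue measure and integral in
the corresponding $R_n$.

8. The space $L_p(\mathcal{E})$ ($p \ge 1$) of functions $\varphi(x)$ measurable on the measurable
set $\mathcal{E}$ and such that

$$\int_{\mathcal{E}} |\varphi(x)|^p\, dx < +\infty,$$

where

$$\rho(\varphi, \psi) = \Big[\int_{\mathcal{E}} |\varphi(x) - \psi(x)|^p\, dx\Big]^{1/p}.$$  (18)

The distance $\rho(\varphi, \psi)$ has the properties mentioned in the axioms [62].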

9. The space $V[a, b]$ of functions $\varphi(x)$ of bounded variation on the closed
interval $a \le x \le b$, continuous from the right at interior points of the interval
and equal to zero at $x = a$, where

$$\rho(\varphi, \psi) = \bigvee_a^b [\varphi(x) - \psi(x)].$$  (19)

If we dispense with the requirement $\varphi(a) = 0$, $\rho(\varphi, \psi)$ is defined as follows:

$$\rho(\varphi, \psi) = |\varphi(a) - \psi(a)| + \bigvee_a^b [\varphi(x) - \psi(x)].$$  (20)

This widens the space, and the original space is isometric with part of the
widened space.

All the above spaces are complete.

We have proved that $L_p$ and $l_p$ are complete. The proof presents no difficulty
in the other cases and will be omitted.

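Let us consider the space $S$ in more detail. On observing that
$\omega(t) = t/(1 + t) = 1 - 1/(1 + t)$ is increasing for $t \ge 0$, we can write

$$\frac{|\alpha + \beta|}{1 + |\alpha + \beta|} \le \frac{|\alpha| + |\beta|}{1 + |\alpha| + |\beta|} \le \frac{|\alpha|}{1 + |\alpha|} + \frac{|\beta|}{1 + |\beta|},$$

whence follows the triangle axiom for $S$. We prove now that convergence in
$S$ is equivalent to convergence in measure.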

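Let $\varphi_n(x) \to \varphi(x)$ in measure on $\mathcal{E}$. Let us show that $\rho(\varphi, \varphi_n) \to 0$. We introduce
the sets $\mathcal{E}_n(\delta) = \mathcal{E}[\,|\varphi(x) - \varphi_n(x)| \ge \delta\,]$. By hypothesis, $m[\mathcal{E}_n(\delta)] \to 0$ as
$n \to \infty$ with any fixed $\delta > 0$. We have

$$\rho(\varphi, \varphi_n) = \int_{\mathcal{E}} \frac{|\varphi(x) - \varphi_n(x)|}{1 + |\varphi(x) - \varphi_n(x)|}\, dx = \int_{\mathcal{E}_n(\delta)} \frac{|\varphi(x) - \varphi_n(x)|}{1 + |\varphi(x) - \varphi_n(x)|}\, dx + \int_{\mathcal{E} - \mathcal{E}_n(\delta)} \frac{|\varphi(x) - \varphi_n(x)|}{1 + |\varphi(x) - \varphi_n(x)|}\, dx,$$

whence we obtain, since $\omega(t)$ is increasing and less than unity, and
$|\varphi(x) - \varphi_n(x)| < \delta$ on the set $\mathcal{E} - \mathcal{E}_n(\delta)$:

$$\rho(\varphi, \varphi_n) \le m[\mathcal{E}_n(\delta)] + \frac{\delta}{1 + \delta}\, m(\mathcal{E}).$$

Given $\varepsilon > 0$, we can fix $\delta > 0$ such that $m(\mathcal{E})\, \delta/(1 + \delta) \le \varepsilon/2$. Further,
an $N$ exists such that $m[\mathcal{E}_n(\delta)] \le \varepsilon/2$ for $n > N$, so that $\rho(\varphi, \varphi_n) \le \varepsilon$ for
$n > N$, i.e. $\rho(\varphi, \varphi_n) \to 0$. Now let $\rho(\varphi, \varphi_n) \to 0$, and let us show that $\varphi_n(x) \to \varphi(x)$
in measure on $\mathcal{E}$. By what has been said regarding $\omega(t)$, we have
$|\varphi(x) - \varphi_n(x)| / [1 + |\varphi(x) - \varphi_n(x)|] \ge \delta/(1 + \delta)$, if $x \in \mathcal{E}_n(\delta)$. Hence,

$$\rho(\varphi, \varphi_n) \ge \int_{\mathcal{E}_n(\delta)} \frac{\delta}{1 + \delta}\, dx = \frac{\delta}{1 + \delta}\, m[\mathcal{E}_n(\delta)],$$

where $\delta > 0$ is assumed fixed. By hypothesis, $\rho(\varphi, \varphi_n) \to 0$, and it follows
from the last inequality that $m[\mathcal{E}_n(\delta)] \to 0$, which is what we needed to prove.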

By utilizing the theorem of [44] and what has been proved, we can say that,
if $\rho(\varphi, \varphi_n) \to 0$ in $S$, there exists a subsequence $\varphi_{n_k}(x)$ such that $\varphi_{n_k}(x) \to \varphi(x)$
almost everywhere on $\mathcal{E}$. The proof that $S$ is complete is essentially the same
as the proof for $L_p$.

We could have used the Lebesgue-Stieltjes measure and integral in forming
the function spaces $L_p$, $M$ and $S$.


88. Examples of applications of the principle of compressed mappings.
1. We take the system of $n$ equations with $n$ unknowns:

$$\xi_i = \lambda \sum_{k=1}^{n} a_{ik}\, \xi_k + b_i \qquad (i = 1, 2, \ldots, n),$$  (21)

where $\lambda$ is a numerical parameter. We shall regard the right-hand sides as the
operator $Ax$ from $R_n$ into $R_n$, applied to the element $x(\xi_1, \xi_2, \ldots, \xi_n)$ and acting
in the whole of $R_n$. We have from Cauchy's inequality:

$$\rho(Ax, Ay) \le |\lambda| \Big[\sum_{i,k=1}^{n} |a_{ik}|^2\Big]^{1/2} \rho(x, y).$$

Thus the principle of compressed mappings will be applicable in $R_n$, if

$$|\lambda| < \Big[\sum_{i,k=1}^{n} |a_{ik}|^2\Big]^{-1/2}.$$
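
For system (21) the successive approximations can be carried out directly. The sketch below is an illustration under assumptions: the function name, the stopping rule and the $2 \times 2$ data are chosen here. It first checks the sufficient condition $|\lambda| \big[\sum |a_{ik}|^2\big]^{1/2} < 1$ and then iterates in the metric (9).

```python
import math

def solve_by_iteration(a, b, lam, tol=1e-12, max_iter=10000):
    """Successive approximations for xi_i = lam * sum_k a[i][k] * xi_k + b[i]."""
    n = len(b)
    norm = math.sqrt(sum(a[i][k] ** 2 for i in range(n) for k in range(n)))
    alpha = abs(lam) * norm            # contraction constant from Cauchy's inequality
    if alpha >= 1:
        raise ValueError("sufficient condition |lam| * norm < 1 is not met")
    xi = list(b)                       # arbitrary initial element
    for _ in range(max_iter):
        new = [lam * sum(a[i][k] * xi[k] for k in range(n)) + b[i] for i in range(n)]
        step = math.sqrt(sum((u - v) ** 2 for u, v in zip(new, xi)))
        xi = new
        # alpha/(1-alpha) * step bounds the distance to the exact solution.
        if alpha / (1 - alpha) * step <= tol:
            return xi
    raise RuntimeError("no convergence within max_iter")

a = [[1.0, 2.0], [0.0, 1.0]]
b = [1.0, 1.0]
xi = solve_by_iteration(a, b, lam=0.25)
print(xi)
```

For this data the exact solution is $\xi_1 = 20/9$, $\xi_2 = 4/3$. The condition checked is only sufficient: the system may well be uniquely soluble when it fails, but the principle then gives no information.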


2. We take the infinite system of equations

$$\xi_i = \lambda \sum_{k=1}^{\infty} a_{ik}\, \xi_k + b_i \qquad (i = 1, 2, \ldots),$$  (22)

where the sequence $(b_1, b_2, \ldots)$ is regarded as an element of $m$. If

$$\sup_i \sum_{k=1}^{\infty} |a_{ik}| = c$$

is a finite positive number, the right-hand sides of (22) yield an operator $A$
from $m$ into $m$, defined in the whole of $m$, and the principle of compressed
mappings is applicable if $|\lambda|\, c < 1$. If $(b_1, b_2, \ldots)$ is an element of $l_2$ and

$$\sum_{i,k=1}^{\infty} |a_{ik}|^2 = d^2 < +\infty,$$

the right-hand sides of (22) yield an operator from $l_2$ into $l_2$, defined in the whole
of $l_2$, and the principle of compressed mappings is applicable if $|\lambda|\, d < 1$.
Notice that the solution is unique in these spaces, but that solutions may exist
that do not belong to the spaces.
3. We take the integral equation (one-dimensional case)

$$\varphi(x) = \lambda \int_a^b K(x, t)\, \varphi(t)\, dt + f(x),$$  (23)

where $[a, b]$ is a finite interval and $K(x, t)$ is continuous in the square
$Q$ ($a \le x \le b$; $a \le t \le b$). If $f(x)$ is continuous on $[a, b]$, the right-hand side of (23)
is an operator from $C[a, b]$ into $C[a, b]$, defined on the whole of $C$, and the
principle of compressed mappings is applicable to equation (23) if

$$|\lambda| \max_{a \le x \le b} \int_a^b |K(x, t)|\, dt < 1.$$

If $K(x, t) \in L_2$ on $Q$ and $f(x) \in L_2$ on $[a, b]$ (the interval may also be infinite),
the right-hand side is an operator from $L_2[a, b]$ into $L_2[a, b]$, defined in the whole
of $L_2[a, b]$, and the principle of compressed mappings is applicable to (23) if

$$|\lambda| \Big[\int_a^b \int_a^b |K(x, t)|^2\, dx\, dt\Big]^{1/2} < 1.$$
What has been said also holds for multi-dimensional integral equations.
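
Successive approximations for equation (23) can be tried numerically once the integral is replaced by a quadrature sum. The sketch below is an illustration under assumptions: the trapezoidal rule, the grid size and the example $K(x, t) = xt$, $f(x) = x$, $\lambda = 1$ on $[0, 1]$ are all chosen here. For this kernel $|\lambda| \max_x \int_0^1 |K(x, t)|\, dt = 1/2 < 1$, and the exact solution is $\varphi(x) = \tfrac{3}{2} x$.

```python
def solve_fredholm(kernel, f, lam, a, b, n=100, sweeps=50):
    """Successive approximations for phi(x) = lam * int_a^b K(x,t) phi(t) dt + f(x).

    The integral is replaced by the trapezoidal rule on n subintervals,
    an additional approximation that is not part of the text.
    """
    h = (b - a) / n
    t = [a + i * h for i in range(n + 1)]
    w = [h / 2 if i in (0, n) else h for i in range(n + 1)]   # quadrature weights
    phi = [f(x) for x in t]                                    # initial approximation
    for _ in range(sweeps):
        phi = [lam * sum(w[j] * kernel(x, t[j]) * phi[j] for j in range(n + 1)) + f(x)
               for x in t]
    return t, phi

t, phi = solve_fredholm(lambda x, s: x * s, lambda x: x, lam=1.0, a=0.0, b=1.0)
print(max(abs(p - 1.5 * x) for x, p in zip(t, phi)))
```

The printed value is the largest deviation from the exact solution on the grid; what remains after the iterations have converged is the quadrature error.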
4. We take the non-linear integral equation:

$$\varphi(x) = \lambda \int_a^b K[x, t, \varphi(t)]\, dt,$$  (24)

where $[a, b]$ is a finite interval, $K(x, t, z)$ is a continuous function of its arguments
for $a \le x \le b$, $a \le t \le b$ and $|z| \le C$, where $C$ is a given positive number.
Given any choice of function $\varphi(t)$, continuous for $a \le t \le b$ and satisfying the
condition $|\varphi(t)| \le C$, $K[x, t, \varphi(t)]$ is a continuous function of $(x, t)$ in the
above-mentioned square $Q$. Let $|K(x, t, z)| \le d$ for $(x, t) \in Q$ and $|z| \le C$. If
$|\lambda|\, d\, (b - a) \le C$, the right-hand side of (24) is the operator $A\varphi$ into $C[a, b]$,
for which $D(A)$ is the sphere $\rho(0, \varphi) \le C$, where 0 is the continuous function
equal to zero in $[a, b]$, and $R(A)$ belongs to the same sphere. Notice that we can
write $\rho(0, \varphi) \le C$ in the form $|\varphi(x)| \le C$. Suppose, further, that the kernel
$K(x, t, z)$ satisfies a Lipschitz condition with respect to the third argument,
i.e.

$$|K(x, t, z_1) - K(x, t, z_2)| \le N\, |z_1 - z_2|,$$

if $(x, t) \in Q$, whilst $|z_1|$ and $|z_2| \le C$. Now,

$$\rho(A\varphi, A\psi) \le |\lambda|\, N\, (b - a)\, \rho(\varphi, \psi),$$

so that, when the conditions

$$|\lambda|\, d\, (b - a) \le C \quad \text{and} \quad |\lambda|\, N\, (b - a) < 1$$

are satisfied, the principle of compressed mappings is applicable to equation
(24) in the above-mentioned sphere. This equation has a unique solution in
the sphere, which can be obtained by the method of successive approximations
with any choice of initial approximation $\varphi_1(x)$ in the sphere. This method gives
a uniform convergence of the approximations to the solution in the interval
$[a, b]$.
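
The same scheme works for the non-linear equation (24). The sketch below is an illustration under assumptions: the kernel $K(x, t, z) = xt + \cos z$ on $[0, 1]$, the constants $C = 1$, $d = 2$, $N = 1$, $\lambda = 0.4$ and the trapezoidal quadrature are all chosen here. Both conditions hold, since $|\lambda|\, d\, (b - a) = 0.8 \le C$ and $|\lambda|\, N\, (b - a) = 0.4 < 1$.

```python
import math

def picard(kernel, lam, a, b, n=100, sweeps=60):
    """Successive approximations for phi(x) = lam * int_a^b K[x, t, phi(t)] dt.

    Uses trapezoidal quadrature on n subintervals (a discretization assumption).
    Returns the grid, the final iterate and the last sup-distance between iterates.
    """
    h = (b - a) / n
    t = [a + i * h for i in range(n + 1)]
    w = [h / 2 if i in (0, n) else h for i in range(n + 1)]
    phi = [0.0] * (n + 1)      # initial approximation inside the sphere |phi| <= C
    step = float("inf")
    for _ in range(sweeps):
        new = [lam * sum(w[j] * kernel(x, t[j], phi[j]) for j in range(n + 1))
               for x in t]
        step = max(abs(u - v) for u, v in zip(new, phi))
        phi = new
    return t, phi, step

# Assumed kernel: |K| <= 2 and Lipschitz constant 1 in z for |z| <= 1.
K = lambda x, s, z: x * s + math.cos(z)
t, phi, step = picard(K, lam=0.4, a=0.0, b=1.0)
print(max(abs(v) for v in phi), step)
```

All iterates remain in the sphere $|\varphi(x)| \le C$, and the distance between consecutive iterates falls at least as fast as $0.4^n$.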

5. Let $D$ be a domain of three-dimensional space, bounded by the Lyapunov
surface $S$. Let us take the boundary problem for the elliptic equation:

$$\Delta u - \lambda f(x, y, z, u) = 0 \quad \text{inside } D,$$  (25)

$$u|_S = 0,$$  (26)

where $\Delta$ is the Laplace operator. We assume that $f(x, y, z, u)$ is continuous in
the four-dimensional closed domain of space $(x, y, z, u)$ corresponding to the
variation of $(x, y, z)$ in the closed domain $D$ with $|u| \le c$, and has continuous
derivatives with respect to its arguments inside this domain, the derivatives
being continuous as far as the boundary. We suppose further that
$|f(x, y, z, u)| \le d$ for $(x, y, z) \in D$ and $|u| \le c$ and

$$|f(x, y, z, u_1) - f(x, y, z, u_2)| \le N\, |u_1 - u_2|$$

with the conditions indicated ($|u_1|$ and $|u_2| \le c$). Let $G(x, y, z; \xi, \eta, \zeta)$ be
Green's function for the Laplace operator for the domain $D$ with boundary
condition (26) [IV; 220]. We introduce the points $P(x, y, z)$ and $Q(\xi, \eta, \zeta)$ of
$D$. The solution of problem (25) and (26) is equivalent to the solution of the
integral equation

$$u(P) = \lambda \int_D G(P; Q)\, f[Q; u(Q)]\, d\tau_Q$$  (27)

in the space $C(D)$ of functions $u(Q)$ continuous in $D$ [IV; 224]. We know that
$G(P; Q) > 0$ in $D$ [IV; 221], and that there exists a finite

$$\max_{P \in D} \int_D G(P; Q)\, d\tau_Q = G_0.$$

If $|\lambda|\, G_0\, d \le c$, the right-hand side of (27) is the operator into $C(D)$ for
which $D(A)$ is the sphere $\rho(0, u) \le c$ in $C(D)$ (i.e. $|u(P)| \le c$ in $D$) and $R(A)$
belongs to this sphere. The principle of compressed mappings is applicable to
the equation (27) if $|\lambda|\, N\, G_0 < 1$. Thus, when the conditions

$$|\lambda|\, G_0\, d \le c \quad \text{and} \quad |\lambda|\, N\, G_0 < 1$$

are satisfied, problem (25) and (26) has a unique solution in the sphere
$|u(P)| \le c$. It can be obtained by the method of successive approximations, applied
to (27) with any choice of initial approximation from the sphere, and the
approximations are uniformly convergent to the solution in $D$.


89. Compactness. The idea of compactness has already been introduced
for a particular case [IV; 36]. We now discuss this concept for
a general metric space X. A set U of elements of X is said to be compact
in the space X, or simply compact, if any sequence of elements $x_n$ of U
contains a convergent subsequence. If, in addition, U is closed, it is
said to be mutually compact.

It may easily be seen that a necessary condition for the set U to be
compact is that it be bounded. For, if U is unbounded, there exists a
sequence $x_n$ of U such that $\rho(a, x_n) \to +\infty$, where a is any
fixed element. It is impossible to extract a convergent subsequence from
this sequence, since every convergent sequence is bounded. The condition
that U be bounded is also sufficient for it to be compact in $R_n$
[IV; 15]. This is not true in the general case of a metric space, and we
next establish the necessary and sufficient condition for compactness.

We must first introduce a new concept. We shall say that a set U
has a finite $\varepsilon$ net, where $\varepsilon$ is a given positive
number, if there exists a finite set $x_k$ (k = 1, 2, ..., l) of elements
of X such that, given any x of U, an element $x_s$ of our set can be found
such that $\rho(x, x_s) \le \varepsilon$. Notice that the elements $x_k$
may not belong to U.

THEOREM. The necessary and sufficient condition for a set U of elements
of a complete metric space to be compact is that, given any
$\varepsilon > 0$, it has a finite $\varepsilon$ net.

Necessity. Suppose that, for some $\varepsilon_0 > 0$, there is no finite
$\varepsilon_0$ net for U, and let us show that U is not compact. We take
some element $x_1 \in U$. It can be asserted that there is an element
$x_2 \in U$ such that $\rho(x_1, x_2) > \varepsilon_0$. For we should
otherwise have $\rho(x_1, x) \le \varepsilon_0$ for any $x \in U$, and the
element $x_1$ would give an $\varepsilon_0$ net for U. Further, an element
$x_3$ exists such that $\rho(x_i, x_3) > \varepsilon_0$ (i = 1, 2), since
otherwise the elements $x_1$ and $x_2$ would give an $\varepsilon_0$ net
for U, and so on.

We thus obtain an infinite sequence of elements $x_n$ of U such that
$\rho(x_p, x_q) > \varepsilon_0$ for all $p \ne q$. Given any subsequence
$x_{n_k}$ (k = 1, 2, ...) we shall also have
$\rho(x_{n_k}, x_{n_l}) > \varepsilon_0$ for $n_k \ne n_l$, so that no
subsequence of $x_n$ can be convergent, i.e. U is not compact.
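
The selection used in this proof can be tried on a finite set of points. The sketch below is only an illustration under stated assumptions: the space is the Euclidean plane, U is a finite grid in the unit square, and the helper names `greedy_separated_points` and `is_eps_net` are hypothetical. Points are kept only if they lie farther than $\varepsilon_0$ from all points already kept; when the selection stops, the points kept form an $\varepsilon_0$ net.

```python
import math

def greedy_separated_points(points, eps):
    """Keep a point only if it is farther than eps from every point kept so far."""
    centres = []
    for p in points:
        if all(math.dist(p, c) > eps for c in centres):
            centres.append(p)
    return centres

def is_eps_net(centres, points, eps):
    """True if every point lies within eps of at least one centre."""
    return all(any(math.dist(p, c) <= eps for c in centres) for p in points)

# Assumed example: a 21 x 21 grid in the unit square, eps = 0.25.
grid = [(i / 20, j / 20) for i in range(21) for j in range(21)]
centres = greedy_separated_points(grid, 0.25)
```

For a set without a finite $\varepsilon_0$ net the same selection never stops, which is exactly how the sequence $x_n$ of the proof arises.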

Sufficiency. Suppose that U has a finite $\varepsilon$ net for any
$\varepsilon > 0$, and let $x_n$ be a sequence of elements of U.

We have to show that a convergent subsequence can be extracted
from it. If the elements $x_n$ coincide with the same element y for an
infinite number of values of n, then y, y, y, ... is a convergent
subsequence. Suppose that we do not have this situation. Now, if we leave
in the sequence $x_n$ only one element of each group of equal elements
(say the one with the least index), we get a sequence of different
elements. It can be assumed that the original sequence already has this
property. We fix some positive number $\varepsilon$. Since U has a finite
$\varepsilon/2$ net, there exists a finite number of closed spheres of
radius $\varepsilon/2$ such that all the elements of U, and hence all the
$x_n$, belong to these spheres. An infinite set of the $x_n$ belongs to at
least one of them.

Let us denote one of these spheres by $S_1(\varepsilon/2)$. Further, there
exists a finite number of spheres of radius $\varepsilon/2^2$, to which all
the $x_n$ of $S_1(\varepsilon/2)$ belong. Of these, we take the sphere
$S_2(\varepsilon/2^2)$, which contains an infinite set of the elements in
question. Similarly, there exists a sphere $S_3(\varepsilon/2^3)$ of radius
$\varepsilon/2^3$, containing an infinite set of elements $x_n$ belonging
simultaneously to $S_1(\varepsilon/2)$ and $S_2(\varepsilon/2^2)$. On
continuing in this way, we get an infinite sequence of closed spheres
$S_k(\varepsilon/2^k)$ such that the radius of $S_k(\varepsilon/2^k)$ is
$\varepsilon/2^k$, and $S_k(\varepsilon/2^k)$ contains an infinite set of
elements $x_n$ belonging simultaneously to all the spheres
$S_m(\varepsilon/2^m)$, where $m < k$. We take from each of the spheres
$S_k(\varepsilon/2^k)$ an element $x_{n_k}$ of the infinite set just
mentioned, where it can be assumed that $n_l > n_k$ for $l > k$. We thus
obtain an infinite subsequence $x_{n_k}$ of the sequence $x_n$. On
observing that, by the triangle axiom, we have $\rho(x, y) \le 2r$ for any
two elements x and y belonging to the same sphere of radius r, we can say
that

    $\rho(x_{n_k}, x_{n_l}) \le \frac{\varepsilon}{2^{k-1}}$  for $n_l > n_k$.

Hence the sequence $x_{n_k}$ is mutually convergent, and it follows, since
the space is complete, that $x_{n_k}$ is a convergent sequence. The theorem
is proved.

Note 1. It is sufficient for compactness that there exists merely
a compact, and not a finite, $\varepsilon$ net for any $\varepsilon > 0$.
This means that, given any $\varepsilon > 0$, there exist spheres of radius
$\varepsilon$, containing all the elements of U, the centres of which form
a compact set. Let $U_\varepsilon$ denote this set of centres. By the
theorem (necessity), there exists for $U_\varepsilon$ a finite
$\varepsilon$ net, and it follows at once from the triangle axiom that this
net will be a finite $2\varepsilon$ net for U, whence it follows, in view
of the theorem (sufficiency) and the fact that $\varepsilon$ is arbitrary,
that U is a compact set.

Note 2. Notice that U can coincide with X, so that we can speak 
of the compactness of the whole of X. It is easily shown, from the fact 
that distance is continuous, that every compact space is complete. 
Thus every set U compact in itself of elements of X is a complete 
metric space. 


90. Compactness in C. Let C be the space of functions continuous
in the finite interval [a, b], and U some set of elements of C. We have
seen that the sufficient conditions for U to be compact are that the
functions of U be bounded and equicontinuous [IV; 16]. Let us show
that these conditions are necessary. Let U be compact. By our theorem,
given any $\varepsilon > 0$, there exists a finite number of functions
$\varphi_1(t), \varphi_2(t), ..., \varphi_p(t)$ of C such that, for any
function $\varphi(t)$ of U, we have
$|\varphi(t) - \varphi_s(t)| \le \varepsilon/3$ for $a \le t \le b$, where
$\varphi_s(t)$ is one of the above-mentioned functions. Since there is a
finite number of such functions, there exists for all of them a positive
$\eta$ such that

    $|\varphi_k(t + h) - \varphi_k(t)| \le \varepsilon/3$  for $|h| \le \eta$  (k = 1, 2, ..., p),
    (t and $t + h \in [a, b]$)

where $\eta$ depends only on $\varepsilon$.
We get from this:

    $|\varphi(t + h) - \varphi(t)| \le |\varphi(t + h) - \varphi_s(t + h)| + |\varphi_s(t + h) - \varphi_s(t)| + |\varphi_s(t) - \varphi(t)|.$

When $|h| \le \eta$, all the terms on the right-hand side are
$\le \varepsilon/3$, so that $|\varphi(t + h) - \varphi(t)| \le \varepsilon$
for $|h| \le \eta$, and the equicontinuity of the functions of U is proved.
The fact that they are bounded is an immediate consequence of the fact that
boundedness is a necessary condition for U to be compact [89]. This
criterion for compactness is proved in precisely the same way for functions
of several variables, defined in a finite closed domain of $R_n$. If the
functions are defined on a bounded closed set, the proof is the same in
essence.
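
The role of equicontinuity can be seen on two bounded families in C on [0, 1]. The sketch below is an illustration with assumed examples (the powers $t^n$ and the functions $\sin(nt)/n$) and a hypothetical helper `increment_sup`, which samples $|\varphi(t + \eta) - \varphi(t)|$ on a grid and therefore gives only a lower bound for the true supremum over $|h| \le \eta$.

```python
import math

def increment_sup(phi, eta, a=0.0, b=1.0, samples=2001):
    """Sampled maximum of |phi(t + eta) - phi(t)| for t and t + eta in [a, b]."""
    width = b - a - eta
    best = 0.0
    for i in range(samples):
        t = a + width * i / (samples - 1)
        best = max(best, abs(phi(t + eta) - phi(t)))
    return best

eta = 0.01
# Powers t^n: bounded by 1, but the increments do not become small uniformly in n.
powers = [increment_sup(lambda t, n=n: t ** n, eta) for n in (1, 10, 100, 1000)]
# sin(n t) / n: derivatives bounded by 1, so increments never exceed eta.
sines = [increment_sup(lambda t, n=n: math.sin(n * t) / n, eta) for n in range(1, 51)]
```

For the powers the sampled increments approach 1 as n grows, so no single $\eta$ serves the whole family, whereas for the second family the same $\eta$ works for every n.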


91. Compactness in $L_p$. Let us take the space $L_p$ of functions
$\varphi(x, y)$ on some measurable bounded set $\mathcal{E}$ of the (x, y)
plane. We shall assume in future that all these functions are continued by
zero outside $\mathcal{E}$ and that integration is carried out over the
whole plane. The integrals in fact reduce to integrals over a bounded
measurable set.

THEOREM. The necessary and sufficient condition for a set U of elements
of $L_p$ to be compact is that all the functions $\varphi(x, y)$ of U
satisfy the following two conditions:

1. There exists a C > 0 such that

    $\|\varphi\| = \Big[ \iint_{\mathcal{E}} |\varphi(x, y)|^p \, dx \, dy \Big]^{1/p} \le C$  (boundedness),    (28)

where $\|\varphi\|$ is the notation for the left-hand side of (28). This
quantity is called the norm of $\varphi(x, y)$ in $L_p$ on $\mathcal{E}$
[62].

2. Given any $\varepsilon > 0$, there exists an $\eta > 0$, the same for
all $\varphi(x, y)$ of U, such that

    $\Big[ \iint |\varphi(x + h, y + k) - \varphi(x, y)|^p \, dx \, dy \Big]^{1/p} \le \varepsilon$  for $\sqrt{h^2 + k^2} \le \eta$.    (29)

We know that, given $\varepsilon > 0$, there exists for every fixed
function of $L_p$ an $\eta > 0$ such that (29) holds (continuity in the
mean) [70]. This is also obviously true for a finite number of functions
$\psi_k(x, y)$ (k = 1, 2, ..., m) of $L_p$. It is sufficient to take the
least of the $\eta$ corresponding to the $\psi_k(x, y)$. Property (29),
which must hold for all the $\varphi(x, y)$ of U, may be termed
equicontinuity in the mean of all functions of U. Notice also that
$\varphi(x + h, y + k)$ is a measurable function and that
$\varphi(x + h, y + k) - \varphi(x, y) = 0$ outside some bounded measurable
set.

The necessity of the conditions is proved in the same way as for C.
The only change is in taking $\|\varphi - \psi\| = \rho(\varphi, \psi)$ in
$L_p$ in place of the absolute value of $\varphi - \psi$. For, boundedness
(28), which can be written in the form $\rho(0, \varphi) \le C$, as we know,
is necessary for compactness. Further, the compactness implies the
existence of a finite $\varepsilon/3$ net, i.e. a finite number of functions
$\psi_s(x, y)$ (s = 1, 2, ..., m) of $L_p$ such that, given any
$\varphi(x, y) \in U$, there exists a $\psi_s(x, y)$ such that
$\|\varphi - \psi_s\| \le \varepsilon/3$. There exists for all the
$\psi_s(x, y)$ an $\eta > 0$ such that

    $\|\psi_s(x + h, y + k) - \psi_s(x, y)\| \le \varepsilon/3$  for $\sqrt{h^2 + k^2} \le \eta$.    (30)

We can also write:

    $|\varphi(x + h, y + k) - \varphi(x, y)| \le |\varphi(x + h, y + k) - \psi_s(x + h, y + k)| + |\psi_s(x + h, y + k) - \psi_s(x, y)| + |\psi_s(x, y) - \varphi(x, y)|$

and, in view of (30) and $\|\varphi - \psi_s\| \le \varepsilon/3$, we get
(29) on applying Minkovskii's inequality with p > 1. When p = 1, (29) is
obtained at once.

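Let us now prove the sufficiency of conditions (28) and (29). Let
$\varphi_\varrho(x, y)$ denote the mean function for $\varphi(x, y)$ [71].
We recall (178) of [71]. Given any $p \ge 1$, it takes the form

    $\|\varphi - \varphi_\varrho\|^p \le \frac{C_1}{\varrho^2} \iint_{u^2 + v^2 \le \varrho^2} \Big[ \iint |\varphi(\xi, \eta) - \varphi(\xi + u, \eta + v)|^p \, d\xi \, d\eta \Big] du \, dv$,    (31)

where $C_1$ is a constant. By condition (29), given any $\varepsilon > 0$,
there exists an $\eta > 0$, the same for all $\varphi(x, y) \in U$, such
that

    $\iint |\varphi(\xi, \eta) - \varphi(\xi + u, \eta + v)|^p \, d\xi \, d\eta \le \frac{\varepsilon^p}{\pi C_1}$  for $u^2 + v^2 \le \eta^2$,

and inequality (31) gives us, for any $\varphi(x, y) \in U$:

    $\|\varphi - \varphi_\varrho\| \le \varepsilon$  for $\varrho \le \eta$.    (32)

The norm on the left is taken over the entire plane $\mathcal{E}_\infty$
(actually, over a bounded set). All the more,
$\|\varphi - \varphi_\varrho\|_{\mathcal{E}} \le \varepsilon$ for
$\varrho \le \eta$. Having fixed $\varrho \le \eta$, we can say that the
functions $\varphi_\varrho(x, y)$ form an $\varepsilon$ net for the set U
of functions $\varphi(x, y)$. Let $\Delta$ ($a \le x \le b$;
$c \le y \le d$) be an interval containing $\mathcal{E}$. By condition (28)
and Theorem 8 of [71], we can say that the set of functions
$\varphi_\varrho(x, y)$ is compact in C on $\Delta$, and all the more,
compact in $L_p$ on $\mathcal{E}$. Hence the functions
$\varphi_\varrho(x, y)$ form a compact $\varepsilon$ net for U, and we can
say, since $\varepsilon$ is arbitrary, that the set U is compact. The
sufficiency of (28) and (29) is proved.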

We now take the case when $\mathcal{E}$ is the entire plane
$\mathcal{E}_\infty$. The above proof now loses its force, since the set of
functions $\varphi_\varrho(x, y)$ may be non-compact. The necessity of
conditions (28) and (29) for compactness is proved as above. But these
conditions are not sufficient. We have to add a further condition, viz:
given any $\varepsilon > 0$, there exists a positive N, the same for all
$\varphi(x, y)$ of U, such that

    $\iint_{\mathcal{E}_\infty - \Delta_N} |\varphi(x, y)|^p \, dx \, dy \le \varepsilon^p$,    (33)

where $\Delta_m$ is the interval ($-m \le x \le m$; $-m \le y \le m$).
Notice that, if condition (33) is fulfilled for a certain N, it is
preserved when N increases.

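Let us prove the necessity of condition (33). If U is a compact set,
it has a finite $\varepsilon/2$ net of functions $\varphi_k(x, y)$
(k = 1, 2, ..., n). Condition (33) is satisfied for each of these
functions, and, since there is a finite number of them, given any
$\varepsilon > 0$, there exists an N such that

    $\iint_{\mathcal{E}_\infty - \Delta_N} |\varphi_k(x, y)|^p \, dx \, dy \le \frac{\varepsilon^p}{2^p}$  (k = 1, 2, ..., n).    (34)

We take some $\varphi(x, y) \in U$. There exists a $\varphi_s(x, y)$ such
that $\|\varphi - \varphi_s\|_{\mathcal{E}_\infty} \le \varepsilon/2$. It
follows from

    $\|\varphi\|_{\mathcal{E}_\infty - \Delta_N} \le \|\varphi - \varphi_s\|_{\mathcal{E}_\infty - \Delta_N} + \|\varphi_s\|_{\mathcal{E}_\infty - \Delta_N}$,

together with (34) and
$\|\varphi - \varphi_s\|_{\mathcal{E}_\infty - \Delta_N} \le \|\varphi - \varphi_s\|_{\mathcal{E}_\infty} \le \varepsilon/2$,
that

    $\|\varphi\|_{\mathcal{E}_\infty - \Delta_N} = \Big[ \iint_{\mathcal{E}_\infty - \Delta_N} |\varphi(x, y)|^p \, dx \, dy \Big]^{1/p} \le \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon$,

which proves (33) for any $\varphi(x, y) \in U$.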

Let us now prove the sufficiency of (28), (29) and (33). Let these
conditions be satisfied for functions $\varphi(x, y) \in U$, and let
$\varphi_1(x, y), \varphi_2(x, y), ...$ be any sequence of functions of U.
We have to show that we can extract from it a subsequence which is
convergent in $L_p$ on $\mathcal{E}_\infty$. It follows from (28) and (29)
that a subsequence $\varphi_1^{(1)}(x, y), \varphi_2^{(1)}(x, y), ...$ can
be found which is convergent in $L_p$ on $\Delta_1$. We can extract from
this latter a new subsequence
$\varphi_1^{(2)}(x, y), \varphi_2^{(2)}(x, y), ...$, which is convergent in
$L_p$ on $\Delta_2$, and so on. We form the subsequence

    $\varphi_1^{(1)}(x, y), \varphi_2^{(2)}(x, y), \varphi_3^{(3)}(x, y), ...$,    (35)

which is a subsequence of the original sequence $\varphi_n(x, y)$. If m is
any positive integer, all the terms of sequence (35), as from
$\varphi_m^{(m)}(x, y)$, belong to the sequence
$\varphi_1^{(m)}(x, y), \varphi_2^{(m)}(x, y), ...$, which is convergent in
$L_p$ on $\Delta_m$, i.e. sequence (35) is convergent in $L_p$ on any finite
interval $\Delta_m$ (m = 1, 2, ...). We show that it is convergent in $L_p$
on $\mathcal{E}_\infty$ also.


We consider the integral:

    $\|\varphi_q^{(q)} - \varphi_r^{(r)}\|^p_{\mathcal{E}_\infty} = \iint_{\Delta_m} |\varphi_q^{(q)}(x, y) - \varphi_r^{(r)}(x, y)|^p \, dx \, dy + \iint_{\mathcal{E}_\infty - \Delta_m} |\varphi_q^{(q)}(x, y) - \varphi_r^{(r)}(x, y)|^p \, dx \, dy.$

On using the obvious inequality $|x + y|^p \le 2^p |x|^p + 2^p |y|^p$,
we get

    $\|\varphi_q^{(q)} - \varphi_r^{(r)}\|^p_{\mathcal{E}_\infty} \le \iint_{\Delta_m} |\varphi_q^{(q)}(x, y) - \varphi_r^{(r)}(x, y)|^p \, dx \, dy + 2^p \iint_{\mathcal{E}_\infty - \Delta_m} |\varphi_q^{(q)}(x, y)|^p \, dx \, dy + 2^p \iint_{\mathcal{E}_\infty - \Delta_m} |\varphi_r^{(r)}(x, y)|^p \, dx \, dy.$

By condition (33), given any $\varepsilon > 0$, there exists an m such that
the sum of the terms on the right-hand side, apart from the first, is less
than $\varepsilon^p/2$. Having fixed such an m, we get

    $\|\varphi_q^{(q)} - \varphi_r^{(r)}\|^p_{\mathcal{E}_\infty} \le \iint_{\Delta_m} |\varphi_q^{(q)}(x, y) - \varphi_r^{(r)}(x, y)|^p \, dx \, dy + \frac{\varepsilon^p}{2}.$

But it follows from the convergence of sequence (35) in $L_p$ on $\Delta_m$
that the integral on the right is not greater than $\varepsilon^p/2$ for
all sufficiently large q and r, and hence there exists an M such that

    $\|\varphi_q^{(q)} - \varphi_r^{(r)}\|^p_{\mathcal{E}_\infty} \le \varepsilon^p$  for q and $r \ge M$,

i.e. sequence (35) is mutually convergent in $L_p(\mathcal{E}_\infty)$,
and, since $L_p(\mathcal{E}_\infty)$ is complete, this sequence has a limit
in $L_p(\mathcal{E}_\infty)$. The sufficiency of conditions (28), (29) and
(33) is proved. It is easily shown that the last condition is not a
consequence of the first two.
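
One way to see that (33) does not follow from (28) and (29) is to look at translates of a single function. The sketch below is an illustration under stated assumptions: it works on the real line instead of the plane, takes indicator functions of the intervals [n, n + 1], and computes all norms exactly from interval overlaps; the helper names are hypothetical.

```python
def overlap(a, b, c, d):
    """Length of the intersection of the intervals [a, b] and [c, d]."""
    return max(0.0, min(b, d) - max(a, c))

def lp_distance(n, m, p):
    """L_p distance between the indicators of [n, n + 1] and [m, m + 1]."""
    return (2.0 * (1.0 - overlap(n, n + 1, m, m + 1))) ** (1.0 / p)

def shift_norm(n, h, p):
    """L_p norm of phi(x + h) - phi(x) for the indicator phi of [n, n + 1]."""
    return lp_distance(n, n - h, p)

def tail_norm(n, big_n, p):
    """L_p norm of the indicator of [n, n + 1] outside the interval [-N, N]."""
    return (1.0 - overlap(n, n + 1, -big_n, big_n)) ** (1.0 / p)

p = 2
shifts = [shift_norm(n, 0.01, p) for n in range(10)]      # the same for every n
tails = [tail_norm(big_n, big_n, p) for big_n in range(1, 10)]  # always equal to 1
far_apart = lp_distance(3, 8, p)                          # 2 ** (1 / p)
```

Every function of the family has norm 1 and the same continuity in the mean, yet for each N the function with n = N keeps its whole norm outside $[-N, N]$, and distinct members stay at distance $2^{1/p}$, so no subsequence converges.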

We have considered the case of a plane for definiteness. Everything
said obviously also holds in any space $R_n$. On writing
$x(x_1, x_2, ..., x_n)$ for a point of this space and introducing the
notation $dx = dx_1 \, dx_2 \, ... \, dx_n$, we can write (28) and (29) as

    $\Big[ \int_{\mathcal{E}} |\varphi(x)|^p \, dx \Big]^{1/p} \le C$,    (36)

    $\Big[ \int_{\mathcal{E}} |\varphi(x + y) - \varphi(x)|^p \, dx \Big]^{1/p} \le \varepsilon$  for $|y| \le \eta$,    (37)

where y has components $(y_1, y_2, ..., y_n)$ and
$|y| = \sqrt{y_1^2 + y_2^2 + ... + y_n^2}$.


92. Compactness in $l_p$. Let us prove the following theorem:

THEOREM. The necessary and sufficient condition for a set U of elements
of $l_p$ ($p \ge 1$) to be compact is that all the elements
$x(\xi_1, \xi_2, ...)$ of U satisfy the following two conditions:

1. There exists a number C > 0 such that

    $\|x\| = \Big( \sum_{k=1}^{\infty} |\xi_k|^p \Big)^{1/p} \le C$  (boundedness).    (38)

2. Given any $\varepsilon > 0$, there exists a positive integer
$n_\varepsilon$, the same for all $x \in U$, such that

    $\Big( \sum_{k=n_\varepsilon}^{\infty} |\xi_k|^p \Big)^{1/p} \le \varepsilon$.    (39)
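Necessity. As we know, boundedness (38) is a necessary condition
for compactness. Further, the compactness of U implies the existence
of a finite number of elements $x_k$ (k = 1, 2, ..., m) of $l_p$ such that
we have for any $x \in U$:
$\rho(x, x_s) = \|x - x_s\| \le \varepsilon/2$, where $x_s$ is one of the
elements $x_k$. There exists for the elements
$x_k(\xi_1^{(k)}, \xi_2^{(k)}, ...)$, since the number of them is finite, a
positive integer $n_\varepsilon$ such that

    $\Big( \sum_{l=n_\varepsilon}^{\infty} |\xi_l^{(k)}|^p \Big)^{1/p} \le \frac{\varepsilon}{2}$  (k = 1, 2, ..., m).    (40)

But it follows from $\|x - x_s\| \le \varepsilon/2$ that

    $\Big( \sum_{l=n_\varepsilon}^{\infty} |\xi_l - \xi_l^{(s)}|^p \Big)^{1/p} \le \Big( \sum_{l=1}^{\infty} |\xi_l - \xi_l^{(s)}|^p \Big)^{1/p} \le \frac{\varepsilon}{2}$,

and we obtain, on applying Minkovskii's inequality for sums (p > 1):

    $\Big( \sum_{l=n_\varepsilon}^{\infty} |\xi_l|^p \Big)^{1/p} \le \Big( \sum_{l=n_\varepsilon}^{\infty} |\xi_l - \xi_l^{(s)}|^p \Big)^{1/p} + \Big( \sum_{l=n_\varepsilon}^{\infty} |\xi_l^{(s)}|^p \Big)^{1/p} \le \varepsilon$,

which proves that condition (39) is necessary. The proof makes no
use of Minkovskii's inequality when p = 1.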

Sufficiency. Suppose that the elements of U satisfy conditions
(38) and (39); let us prove that U is compact. Given $\varepsilon > 0$, we
associate with each element $(\xi_1, \xi_2, ...)$ of U a cut-off element
$(\xi_1, \xi_2, ..., \xi_{n_\varepsilon - 1}, 0, 0, ...)$, and write
$U_\varepsilon$ for the set of these cut-off elements. It follows from (39)
that there exists for any $x \in U$ an element $y \in U_\varepsilon$ such
that $\|x - y\| \le \varepsilon$, i.e. the set $U_\varepsilon$ is an
$\varepsilon$ net for U. It remains to show that $U_\varepsilon$ is a
compact set [89]. The proof is analogous to the proof that every set
bounded in $R_n$ is compact.

By (38), we have $|\xi_k| \le C$ for any component of the elements of
$U_\varepsilon$. We can extract from any sequence of elements of
$U_\varepsilon$ a subsequence for which the first $(n_\varepsilon - 1)$
components have a finite limit. The remaining components of these elements
are zero, whence it follows that the subsequence in question is convergent
in $l_p$ to an element for which all the components $\xi_s$ are zero for
$s \ge n_\varepsilon$. The compactness of $U_\varepsilon$ is therefore
proved. It follows from the present theorem that the sphere $\|x\| \le r$
in $l_p$ is non-compact.
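
The cut-off construction and condition (39) can be tried on finite truncations of sequences. The sketch below is an illustration with assumed data: sequences are stored as finite lists, the first family has components $\pm 2^{-k}$, the second consists of the unit vectors, and the helper names are hypothetical.

```python
def lp_norm(xs, p):
    return sum(abs(v) ** p for v in xs) ** (1.0 / p)

def tail_norm(xs, n_eps, p):
    """Left-hand side of (39): norm of the components with index >= n_eps (1-based)."""
    return lp_norm(xs[n_eps - 1:], p)

def cut_off(xs, n_eps):
    """Keep the first n_eps - 1 components and replace the rest by zeros."""
    return xs[:n_eps - 1] + [0.0] * (len(xs) - (n_eps - 1))

def smallest_n_eps(family, eps, p):
    """Least n with tail_norm <= eps for every element of the (truncated) family."""
    length = len(family[0])
    for n in range(1, length + 2):
        if all(tail_norm(x, n, p) <= eps for x in family):
            return n

def unit_vectors(length):
    return [[1.0 if k == j else 0.0 for k in range(length)] for j in range(length)]

p, eps, length = 2, 1e-3, 40
geometric = [[sign * 2.0 ** -k for k in range(1, length + 1)] for sign in (1, -1)]
n_eps = smallest_n_eps(geometric, eps, p)
x = geometric[0]
gap = lp_norm([a - b for a, b in zip(x, cut_off(x, n_eps))], p)
```

For the first family a single $n_\varepsilon$ serves all elements and the cut-off element lies within $\varepsilon$; for the unit vectors the least admissible index grows with the truncation length, which reflects the failure of (39) on the unit sphere.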


93. Functionals on mutually compact sets. Let the functional $l(x)$,
which takes real values, be defined on the mutually compact set U of
the metric space X. It is said to be continuous if $x_n \Rightarrow x_0$
implies $l(x_n) \to l(x_0)$.

A theorem holds for such functionals, analogous to the theorem on
continuous functions on bounded closed sets of space $R_n$.

THEOREM 1. If U is a mutually compact set of space X and $l(x)$ is a
real continuous functional on U, it is bounded and attains its strict lower
and upper bounds on U.

We shall only prove that $l(x)$ is bounded from below and attains its
strict lower bound. The boundedness is proved by reductio ad absurdum.
If the set of values of $l(x)$ were not bounded from below, there
would exist a sequence of elements $x_n$ of U such that
$l(x_n) \to -\infty$. Since U is compact, we can extract from the $x_n$ a
convergent subsequence $x_{n_k} \Rightarrow x_0$, and in view of the mutual
compactness, $x_0 \in U$. Now, since $l(x)$ is continuous, we have
$l(x_{n_k}) \to l(x_0)$, which contradicts $l(x_{n_k}) \to -\infty$, since
$l(x_0)$ is a finite number.

Let a be the strict lower bound of the set of values of $l(x)$ on U.
There now exists a sequence of elements $x_n \in U$ such that
$a \le l(x_n) \le a + 1/n$. As above, we can assume that
$x_{n_k} \Rightarrow x_0$, where $x_0 \in U$, and consequently
$l(x_{n_k}) \to l(x_0)$. But it follows from
$a \le l(x_{n_k}) \le a + 1/n_k$ that $l(x_{n_k}) \to a$, whence
$l(x_0) = a$, which is what we wished to prove.

We introduced above the concepts of lower and upper limits of real
number sequences $a_n$ (n = 1, 2, ...). Let us bring in the notation for
these:

    $S = \underline{\lim}\, a_n$;  $T = \overline{\lim}\, a_n$.

These limits may be equal to $+\infty$ or $-\infty$.

If a sequence $a_n$ has a limit, S and T coincide with this limit. In
addition, it follows from the definition of S and T that no subsequence
$a_{n_k}$ of the sequence $a_n$ can have a limit which is less than S or
greater than T, but there is at least one subsequence which has the limit
S, and one with the limit T. The functional $l(x)$ is said to be
semi-continuous from below on U if $x_n \Rightarrow x_0$ implies that
$\underline{\lim}\, l(x_n) \ge l(x_0)$, and is semi-continuous from above if
$x_n \Rightarrow x_0$ implies $\overline{\lim}\, l(x_n) \le l(x_0)$.

Let us prove a generalization of Theorem 1, important in applications.

THEOREM 2. A functional $l(x)$, defined on a mutually compact set U of
a metric space and semi-continuous from below (above), is bounded from
below (above) and attains on U its strict lower (upper) bound.

We take a functional semi-continuous from below, and prove as in Theorem 1
that it is bounded from below. The assumption that $l(x_n) \to -\infty$
leads to a subsequence $x_{n_k} \Rightarrow x_0$, where $x_0 \in U$ and
$l(x_{n_k}) \to -\infty$. But, in view of the semi-continuity from below,
$\underline{\lim}\, l(x_{n_k}) \ge l(x_0)$, where $l(x_0)$ is a finite
number, which contradicts $l(x_{n_k}) \to -\infty$.

Let a be the strict lower bound of the set of values of $l(x)$ on U. As
in the proof of Theorem 1, we get a subsequence $x_{n_k} \Rightarrow x_0$
and $a \le l(x_{n_k}) \le a + 1/n_k$. It follows from the first that
$\underline{\lim}\, l(x_{n_k}) \ge l(x_0)$, and from the second:
$\lim l(x_{n_k}) = a$, whence $l(x_0) \le a$. But a is the strict lower
bound of values of $l(x)$, so that $l(x_0) = a$, which is what we set out
to prove.


94. Separability. A metric space X, containing an infinite set of
elements, is called separable if there exists a denumerable set of
elements of X: $x_1, x_2, ...$, dense in X, i.e. for any $x \in X$ and any
$\varepsilon > 0$, there is an element $x_k$ of the set mentioned such that
$\rho(x, x_k) \le \varepsilon$.

We proved above the separability of $l_p$ and $L_p$ ($p \ge 1$) [59, 60].
In space C, the set of all polynomials with rational coefficients is an
example of the denumerable set. An example in space $R_n$ is the set of
elements $(a_1, a_2, ..., a_n)$, for which all the $a_k$ are rational or
(in the case of a complex space) have the form
$a_k = \alpha_k + i\beta_k$, where $\alpha_k$ and $\beta_k$ are real
rational numbers.

In space s, the denumerable set is the set of elements of the form
$(a_1, a_2, ..., a_n, 0, 0, ...)$, where all the $a_k$ are rational numbers.

Let us show that space m is not separable. We take the set U of
different elements $x(a_1, a_2, ...)$ of m such that the $a_k$ are zero or
unity. On assuming that $a_k$ is the kth digit after the point of a number
written in a numeration system to base 2, we see that the set U is
non-denumerable. On taking into account what was said in [1], it is
easily seen that it has the power of a continuum. We have $\rho(x, y) = 1$
for any two distinct elements x and y of U. Let space m be separable,
i.e. there exists a denumerable set $x_k$ (k = 1, 2, ...) of elements of m,
dense in m, and let $S_k$ be the sphere with centre $x_k$ and radius 1/3.
Since the set $x_k$ is dense in m, every element of U lies in at least one
of these spheres. The set of these spheres is denumerable, whilst U is
non-denumerable, so that at least one of the spheres contains more than one
element of U. Let y and z be distinct elements of U, lying in the same
sphere. We have: $\rho(y, z) \le 2/3$, which contradicts $\rho(y, z) = 1$,
and proves that m is not separable.
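
The counting in this argument can be checked on truncated sequences. The sketch below is an illustration only: it takes all sequences of zeros and ones of an assumed finite length 6 in place of the infinite sequences of m, uses the maximum of the component differences as the distance, and the helper names are hypothetical.

```python
from itertools import product

def sup_distance(x, y):
    return max(abs(a - b) for a, b in zip(x, y))

def count_in_sphere(centre, radius, points):
    """Number of points lying in the closed sphere with the given centre."""
    return sum(1 for q in points if sup_distance(centre, q) <= radius)

length = 6                                   # assumed truncation length
points = list(product((0, 1), repeat=length))
pairwise = {sup_distance(x, y) for x in points for y in points if x != y}
# A few sample centres: a point of the set, the midpoint, and a nearby point.
centres = [points[5], (0.5,) * length, (0.2, 0.9, 0.1, 0.8, 0.0, 1.0)]
counts = [count_in_sphere(c, 1.0 / 3.0, points) for c in centres]
```

With length k there are $2^k$ such points at mutual distance 1, and no sphere of radius 1/3 holds two of them, so the number of spheres needed grows without bound as k increases.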

THEOREM. Every set U of elements of a separable space X is separable.

We have to prove the existence of a finite or denumerable set of
elements of U, dense in U. Since X is separable, there exists a
denumerable set of elements $x_n$ (n = 1, 2, ...) of X, dense in X. Let
$S(x, r)$ denote the sphere with centre x and radius r. We consider the
spheres $S(x_n, 1/2^k)$ (n, k = 1, 2, ...), and, if a sphere contains
elements of U, we choose one of these elements. We obtain in this way a
finite or denumerable set $u_m$ (m = 1, 2, ...) of elements of U. Let u be
any element of U and $\varepsilon$ a given positive number. We must show
that $\rho(u, u_m) \le \varepsilon$ for at least one of the $u_m$. We can
assume $\varepsilon < 1$, so that there exists a positive integer l such
that

    $\frac{1}{2^{l-1}} \le \varepsilon < \frac{1}{2^{l-2}}$.

Since the set of $x_n$ is dense in X, there exists an $n = n_0$ such that
$\rho(u, x_{n_0}) \le \varepsilon/4 < 1/2^l$, whence it follows that the
sphere $S(x_{n_0}, 1/2^l)$ contains elements of U. Let $u_p$ be the element
that we have chosen from this sphere (it may not in fact coincide with u).
Since u and $u_p \in S(x_{n_0}, 1/2^l)$, we have
$\rho(u, u_p) \le 1/2^{l-1} \le \varepsilon$, which is what we wanted to
prove.


95. Linear normed spaces. We shall now introduce abstract spaces
which are metric but also have certain other properties. As above, we
shall denote elements of a space by the last letters of the alphabet
x, y, z, ..., and numbers by the first letters a, b, c, ... These numbers
may be regarded as either real or complex. In the first case we have a
real space, and in the second a complex. We shall in future consider
complex spaces, unless there is some special proviso.

The set X of elements x, y, z, ... is called a linear space if its
elements satisfy the following axioms.

Axiom A. Elements of X can be multiplied by a number and added, i.e. 
if x and y are elements of X and a is a number, ax and x + y are also 
definite elements of X. 

The operations mentioned are subject to the following laws: 


    (1) $x + y = y + x$;            (2) $x + (y + z) = (x + y) + z$;
    (3) $a(x + y) = ax + ay$;       (4) $(a + b)x = ax + bx$;
    (5) $a(bx) = (ab)x$;            (6) $1 \cdot x = x$;

    (7) if $x + y = x + z$, then $y = z$.


We must introduce the concept of zero element. Let x and y be any
two elements of X. We shall prove that $0x = 0y$. Let us write
$0x = \theta$ and $0y = \theta_1$. We can write, on using laws 4 and 6:

    $x + \theta = 1x + 0x = (1 + 0)x = 1x = x$

and similarly, $y + \theta_1 = y$. Further, we have from laws 1 and 2:

    $(x + y) + \theta = (x + \theta) + y = x + y$,

and similarly: $(x + y) + \theta_1 = x + y$, whence it follows that
$(x + y) + \theta = (x + y) + \theta_1$, and we obtain by (7):
$\theta = \theta_1$. Thus multiplication of any element by the number 0
gives us the same element, which we call the zero element, and denote by
the symbol $\theta$. The following simple corollaries of the above laws are
easily verified. Given any complex a, $a\theta = \theta$. If
$ax = \theta$ and $a \ne 0$, then $x = \theta$. If $ax = bx$ and
$x \ne \theta$, then $a = b$. If $ax = ay$ and $a \ne 0$, then $x = y$. We
denote $(-1)x$ by the symbol $(-x)$. The difference $x - y$ is defined by
$x - y = x + (-y)$.

It is easily verified that the ordinary rules of algebra also hold for a
difference. We shall in future simply write 0 for the zero element. This
will not cause confusion with the number 0 if proper attention is paid
to our later equations. If one side of an equation is an element of X,
and the other side contains 0, this must be regarded as the zero
element of X.

DEFINITION. The elements $x_1, x_2, ..., x_m$ are said to be linearly
independent if the equation

    $c_1 x_1 + c_2 x_2 + ... + c_m x_m = 0$

can hold when and only when all the numbers $c_k$ (k = 1, 2, ..., m) are
zero.

For the n-dimensional complex space considered in Volume III, the 
maximum number of linearly independent elements is equal to n. An 
axiom is sometimes introduced which excludes the possibility of a 
finite-dimensional space. 

Axiom B. Given any positive integer n, there exist n linearly inde- 
pendent elements. This will not play an important role below. Let us 
introduce one further axiom. 

Axiom C. Each element x has associated with it a definite real
non-negative number $\|x\|$, the norm of the element, and this norm must
satisfy the following three conditions:

    (1) $\|\theta\| = 0$ and $\|x\| > 0$ for $x \ne \theta$,
    (2) $\|x + y\| \le \|x\| + \|y\|$,
    (3) $\|ax\| = |a| \cdot \|x\|$,

where a is any number and $|a|$ is the modulus of a. It follows from the
second and third properties of the norm that $\|-x\| = \|x\|$, and

    $\|x - y\| \ge \|x\| - \|y\|$;  $\|x - y\| \ge \|y\| - \|x\|$;
    $\|x - y\| \ge \big|\, \|x\| - \|y\| \,\big|$.    (41)

The distance between two elements is defined by $\rho(x, y) = \|x - y\|$,
and $\rho(x, y)$ is easily seen to satisfy all three conditions of the
definition of metric space, i.e. every linear normed space is at the same
time a metric space, so that everything said about metric spaces holds for
linear normed spaces. The norm can be expressed in terms of distance
by the obvious formula $\|x\| = \rho(x, \theta)$.
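
The conditions of Axiom C and the inequalities (41) can be spot-checked numerically for a concrete norm. The sketch below is an illustration with assumed data: elements are finite lists of complex numbers, the norm is the p-norm with p = 3, and the helper names are hypothetical. A check on sample vectors is of course not a proof.

```python
def p_norm(x, p):
    return sum(abs(c) ** p for c in x) ** (1.0 / p)

def add(x, y):
    return [a + b for a, b in zip(x, y)]

def scale(a, x):
    return [a * c for c in x]

# Assumed sample elements of a three-dimensional complex space.
x = [3.0, -4.0, 1 + 2j]
y = [1.0, 0.5j, -2.0]
a = 2 - 1j
p = 3

# Condition (2): ||x + y|| <= ||x|| + ||y||, so the gap is non-negative.
triangle_gap = p_norm(x, p) + p_norm(y, p) - p_norm(add(x, y), p)
# Condition (3): ||a x|| = |a| ||x||, so the gap vanishes up to rounding.
homogeneity_gap = abs(p_norm(scale(a, x), p) - abs(a) * p_norm(x, p))
# Last inequality of (41): ||x - y|| >= | ||x|| - ||y|| |.
reverse_gap = p_norm(add(x, scale(-1, y)), p) - abs(p_norm(x, p) - p_norm(y, p))
```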

If we add the requirement that the space be complete, a linear 
normed space is called a type B or B space. Everything below refers 
to B spaces. 

An incomplete linear normed space can be made complete by the
completion of [85]. The norm is defined for an added element by
$\|x\| = \rho(x, \theta)$. All the axioms are retained on completion, and
in particular, axiom A. This latter follows from the continuity
of the sum $x + y$ and the product $ax$, which will be discussed below.
We shall encounter later convergent sequences of numbers and
elements. As above, we shall write $a_n \to a_0$ for convergent sequences
of numbers, and $x_n \Rightarrow x_0$ for convergent element sequences.

The convergence $x_n \Rightarrow x_0$ is equivalent to
$\|x_0 - x_n\| \to 0$. We can consider in B infinite series
$u_1 + u_2 + u_3 + ...$, where $u_k \in B$ (k = 1, 2, ...). We shall write
$z_n = u_1 + u_2 + ... + u_n$.

If the sequence $z_n$ of elements of B has a limit $z_0$, we say that the
series indicated is convergent and has the sum $z_0$.

Let us show that $x + y$ and $ax$ are continuous, i.e. if
$x_n \Rightarrow x_0$, $y_n \Rightarrow y_0$ and $a_n \to a_0$, then
$x_n + y_n \Rightarrow x_0 + y_0$, and $a_n x_n \Rightarrow a_0 x_0$. We
have

    $\|(x_0 + y_0) - (x_n + y_n)\| \le \|x_0 - x_n\| + \|y_0 - y_n\|$.

The right-hand side tends to zero, so that the left does the same, i.e.
$x_n + y_n \Rightarrow x_0 + y_0$. Further, we have

    $\|a_0 x_0 - a_n x_n\| \le \|a_0 x_0 - a_n x_0\| + \|a_n x_0 - a_n x_n\| = |a_0 - a_n| \, \|x_0\| + |a_n| \, \|x_0 - x_n\|$,

the right-hand side of which tends to zero, since $a_n \to a_0$ implies
that $|a_n|$ is bounded. Notice further that, if $x_n \Rightarrow x_0$,
then $\|x_n\| \to \|x_0\|$. This follows from
$\|x_n\| = \rho(x_n, \theta)$ and the continuity of the distance. We define
a lineal in B as follows: a set U of elements of B is called a lineal when
the condition is satisfied: if $x_k \in U$ (k = 1, 2, ..., m), any linear
combination $c_1 x_1 + c_2 x_2 + ... + c_m x_m \in U$. It is sufficient to
verify that, if x and $y \in U$, then $x + y \in U$ and $ax \in U$ for any
choice of the number a. On putting a = 0, we see that the zero element
belongs to every non-empty lineal. A closed lineal will be called a
subspace. It is easily seen that, if U is a non-closed lineal, the closed
set $\bar{U}$ is a subspace, i.e. the closure of a lineal leads to a
subspace. This follows from the fact proved above, that $x + y$ and $ax$
are continuous. If a set U is not a lineal, on forming all possible finite
linear combinations $c_1 x_1 + c_2 x_2 + ... + c_m x_m$ of elements
$x_k \in U$, we get a new set of elements V, which will be a lineal. It is
usually termed the linear envelope of U. This is the least lineal
containing U.

If $x_1, x_2, ..., x_k$ are linearly independent elements, the set U of
elements of B, expressible by $x = c_1 x_1 + c_2 x_2 + ... + c_k x_k$ with
every possible choice of numbers $c_s$, is obviously a lineal. It is easily
shown that this lineal is a closed set (subspace). In view of the linear
independence of the $x_s$, the representation of x by the above formula
is unique. Such a lineal is usually described as finite-dimensional. All
the elements of U can be expressed by:
$x = c_1 y_1 + c_2 y_2 + ... + c_k y_k$, where $y_s$ (s = 1, 2, ..., k) are
any k linearly independent elements of U and $c_s$ are arbitrary numbers,
and the number of terms is equal to k in any formula expressing an element
of U in this form. This number is called the dimensionality of U.

Notice that every subspace of a B space is also a B space.

We have already defined isometry for metric spaces. Let us define
isometry for B spaces. Two such spaces X and X' are said to be isometric
if a one-to-one correspondence can be established between their
elements such that: (1) if x and x', y and y' are any two pairs of
corresponding elements of X and X', then $ax + by$ and $ax' + by'$ are
also corresponding elements, for any choice of the numbers a and b;
(2) the norms of corresponding elements are equal.

It follows from the above that the zero elements of X and X’ must 
be corresponding elements and that the distance between correspond- 
ing elements is the same in X and X’. 

There is no sense in distinguishing between isometric spaces from
the point of view of abstract theory, and we shall write X = X'.


96. Examples of normed spaces. 1. All the spaces mentioned in [87] except
for s and S are B spaces, if we put $\|x\| = \rho(0, x)$ for them. In the
case of sequence spaces, multiplication of an element by a number a amounts,
by definition, to multiplication of each number of the sequence by a, whilst
addition amounts to addition of the numbers having the same index in the
sequences:

    $a(\xi_1, \xi_2, ...) = (a\xi_1, a\xi_2, ...)$;  $(\xi_1, \xi_2, ...) + (\eta_1, \eta_2, ...) = (\xi_1 + \eta_1, \xi_2 + \eta_2, ...)$.

In the case of function spaces, multiplication of an element by a number
a amounts, by definition, to multiplication of the function by a, and
addition of elements to addition of the corresponding functions. The zero
element in a sequence space is the sequence consisting of zeros, and in a
function space it is the function that vanishes identically (in C and V) or
is equivalent to zero (in M, S and $L_p$).

2. Let us take a bounded domain D in $R_n$ and the set $C^{(l)}$ of
functions $\varphi(x)$ having continuous partial derivatives up to order l
inside D, where the derivatives have limiting values on the boundary of D
and are functions continuous in the closed domain $\bar{D}$. For brevity,
we shall say in this case that the function has derivatives continuous in
$\bar{D}$. The set of functions in question is a linear space. We introduce
the norm into it:

    $\|\varphi\|_{C^{(l)}} = \max_{x \in \bar{D},\ 0 \le k \le l} |D^k \varphi(x)|$,    (42)

where $D^k \varphi$ denotes any kth order derivative. The maximum is taken
over all x belonging to $\bar{D}$, for $\varphi(x)$ and its derivatives up
to order l. It is easily seen that the norm satisfies the three basic
conditions [95].

Convergence in $C^{(l)}$ is uniform convergence in $\bar{D}$ of a function
and all its derivatives up to order l. By Cauchy's convergence test and the
familiar theorem on term by term differentiation of a function sequence, we
can say that, if the sequence of elements $\varphi_n(x) \in C^{(l)}$ is
mutually convergent, it is convergent to some element
$\varphi(x) \in C^{(l)}$, i.e. $C^{(l)}$ is a B space.


97. Operators in normed spaces. We have already defined operators
in metric spaces X. New points arise in linear normed spaces. We shall
assume that an operator A is defined on some lineal D(A) of the B
space X, whilst the set of its values R(A) belongs to another B space
X'. An operator is said to be distributive, if, for $x_k \in D(A)$ and any
numbers $c_k$, we have

    $A(c_1 x_1 + c_2 x_2 + ... + c_m x_m) = c_1 A x_1 + c_2 A x_2 + ... + c_m A x_m$.    (43)

It is sufficient to show that $A(cx) = cAx$ and $A(x + y) = Ax + Ay$.
It follows at once from (43) that R(A) is a lineal in X' and that,
if $\theta$ is the zero element in X and $\theta'$ that in X', then
$A\theta = \theta'$. For, $A(\theta) = A(0x)$, where $x \in D(A)$, but
$A(0x) = 0 \cdot Ax = \theta'$.

We shall only discuss distributive operators below, defined on lineals.
Let us recall the definition of continuity. An operator A is said to be
continuous on an element $x_0$ when the condition is satisfied: if $x_n$
(n = 1, 2, ...) and $x_0 \in D(A)$ and $x_n \Rightarrow x_0$ in X, then
$Ax_n \Rightarrow Ax_0$ in X'. It is easily shown that, if a distributive
operator A is continuous on an element $y_0 \in D(A)$, it is continuous on
any element $z_0 \in D(A)$. Let $z_n$ and $z_0 \in D(A)$ and
$z_n \Rightarrow z_0$; we have to show that $Az_n \Rightarrow Az_0$.
We form the elements $y_n = (z_n - z_0) + y_0$ of D(A), where
$y_n \Rightarrow y_0$. We have $Az_n = Az_0 + Ay_n - Ay_0$ and, since
$Ay_n \Rightarrow Ay_0$, we have $Az_n \Rightarrow Az_0$.

Thus there is no point in talking of continuity on an element of $D(A)$;
we must talk of continuity on the whole of $D(A)$. A distributive operator
$A$ is said to be bounded if there exists a positive number $C$ such
that, for any $x \in D(A)$:


$$\|Ax\| \le C \|x\|. \tag{44}$$


Notice that the norm on the right is taken in X, and in X’ on the 
left. We shall prove that, for a distributive operator, boundedness 
and continuity in D(A) are equivalent. 

By what has been said, it is sufficient to consider continuity on the
zero element $\theta$. Let (44) hold. We shall show that, if $x_n \in D(A)$ and
$x_n \Rightarrow \theta$, then $Ax_n \Rightarrow \theta'$. It follows from $x_n \Rightarrow \theta$ that, given any
$\varepsilon > 0$, there exists an $N$ such that $\|x_n\| \le \varepsilon$ for $n > N$ and, by (44),
$\|Ax_n\| \le C\varepsilon$ for $n > N$, whence, since $\varepsilon$ is arbitrary, we have
$Ax_n \Rightarrow \theta'$. Now let $Ax_n \Rightarrow \theta'$ if $x_n \Rightarrow \theta$, and let us prove (44).
If $x = \theta$, (44) reduces to $\|\theta'\| \le C \|\theta\|$, i.e. $0 \le C \cdot 0$, which is
fulfilled (with the = sign) for any choice of $C$; thus it is sufficient to prove
(44) for $x \ne \theta$. We use reductio ad absurdum. If (44) is not valid, there
exists a sequence $x_n \in D(A)$ ($\|x_n\| > 0$), such that $\|Ax_n\| = C_n \|x_n\|$,
where $C_n \to +\infty$. On introducing the elements
$z_n = (1/(C_n \|x_n\|))\, x_n \in D(A)$, for which $\|z_n\| = 1/C_n \to 0$, we get
$\|Az_n\| = 1$, which contradicts the continuity of the operator $A$ on the zero
element.

If $A$ is the annihilation operator, i.e. $Ax = \theta'$ for any $x \in D(A)$, we
can put $C = 0$ in (44). We must have $C > 0$ for any other operator,
and there exists a least positive $C$ for which (44) holds. It is called
the norm of operator $A$ and is obtained from the formula:


$$n_A = \sup_{\|x\| = 1,\; x \in D(A)} \|Ax\|. \tag{45}$$

The norm $n_A$ is also written as $\|A\|$, so that we can write

$$\|Ax\| \le n_A \|x\| \quad \text{or} \quad \|Ax\| \le \|A\| \, \|x\|. \tag{46}$$


The above remarks are exactly analogous to those made say in 
[IV; 36] for a particular case. 
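
Formula (45) can be illustrated numerically. The following is a minimal sketch under assumptions that are not part of the text: X = X' is taken to be the plane with the norm max(|x_1|, |x_2|), the operator A is given by a 2 x 2 matrix, and the names `sup_norm`, `apply` and `estimate_operator_norm` are hypothetical helpers. For this particular norm the supremum in (45) is known to equal the largest absolute row sum of the matrix, and the sampling below should reproduce that value.

```python
import math

def sup_norm(v):
    """Norm ||v|| = max |v_i| on the plane (an assumed choice of norm)."""
    return max(abs(c) for c in v)

def apply(a, v):
    """Apply the 2x2 matrix a (given as a list of rows) to the vector v."""
    return [sum(a[i][j] * v[j] for j in range(2)) for i in range(2)]

def estimate_operator_norm(a, samples=4000):
    """Approximate sup ||Ax|| over ||x|| = 1 by sampling the unit sphere."""
    best = 0.0
    for k in range(samples):
        angle = 2.0 * math.pi * k / samples
        v = [math.cos(angle), math.sin(angle)]
        scale = sup_norm(v)
        unit = [c / scale for c in v]   # rescaled so that ||unit|| = 1
        best = max(best, sup_norm(apply(a, unit)))
    return best

A = [[1.0, -2.0], [0.5, 3.0]]
row_sum_norm = max(sum(abs(c) for c in row) for row in A)
estimate = estimate_operator_norm(A)
```

Every sampled value of ||Ax|| stays below `row_sum_norm`, in agreement with (46), and the largest sampled value approaches it, in agreement with (45).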

THEOREM 1. If the lineal $D(A)$ for a distributive bounded operator $A$
is dense in $X$, $A$ can be extended to the whole of $X$ whilst preserving its
norm and distributive property. Since $\overline{D(A)} = X$, we can write any
element $x_0 \in X$ as a limit $x_n \Rightarrow x_0$, where $x_n \in D(A)$. We shall show
that a limit of $Ax_n$ exists in $X'$, which is independent of the sequence
$x_n$ taken. In fact,


$$\|Ax_n - Ax_m\| = \|A(x_n - x_m)\| \le \|A\| \, \|x_n - x_m\|$$


and, since $x_n \Rightarrow x_0$, the right-hand side $\to 0$ as $n$ and $m \to \infty$; but
now $\|Ax_n - Ax_m\| \to 0$, and a limit of $Ax_n$ exists, since $X'$ is complete.
It remains to show that the limit is independent of the choice of
sequence. Let $x_n$ and $x'_n \in D(A)$, where $x_n \Rightarrow x_0$ and $x'_n \Rightarrow x_0$.
We have to show that the limits of $Ax_n$ and $Ax'_n$ are the same. The
actual existence of the limits follows from the above.

It can easily be seen that the sequence $x_1, x'_1, x_2, x'_2, x_3, x'_3, \ldots$ also
has the limit $x_0$. Thus the sequence $Ax_1, Ax'_1, Ax_2, Ax'_2, Ax_3, Ax'_3, \ldots$
has some limit $y \in X'$. But now the subsequences $Ax_n$ and $Ax'_n$ have
the limit $y$, i.e. the same limit.

If $x_0 \in X$, but $x_0 \notin D(A)$, we take a sequence $x_n \in D(A)$ with
$x_n \Rightarrow x_0$, and put $Ax_0 = \lim_{n \to \infty} Ax_n$. We show that the operator thus
defined in $X$ is distributive and that its norm is not increased on
passing from $D(A)$ to $X$. Let $x_0$ and $x'_0 \in X$, and let $x_n$ and $x'_n$ be two sequences
of $D(A)$ having limits $x_0$ and $x'_0$ (if e.g. $x_0 \in D(A)$, we can put all the
$x_n = x_0$). On recalling that operator $A$ is distributive in $D(A)$, and
that addition and multiplication by a number are continuous, we get


$$A(c_1 x_0 + c_2 x'_0) = \lim_{n \to \infty} A(c_1 x_n + c_2 x'_n) = c_1 \lim_{n \to \infty} Ax_n + c_2 \lim_{n \to \infty} Ax'_n = c_1 Ax_0 + c_2 Ax'_0.$$


The preservation of the norm follows from $\|Ax_n\| \le \|A\| \, \|x_n\|$,
in which the $\|A\|$ on the right is the norm of $A$ in $D(A)$, after a
passage to the limit. The norm can obviously not decrease. The theorem
is proved.

The above method of extending A is usually called an extension in
continuity.

Let us show further that the extension of $A$ from $D(A)$ onto $X$ is
unique in the ordinary sense: if $B$ is a distributive operator bounded
in $X$, coinciding with $A$ on $D(A)$, $B$ must coincide everywhere with
the extension of $A$ in continuity. Let $x_0 \notin D(A)$, $x_n \in D(A)$ and
$x_n \Rightarrow x_0$. Since $B$ is continuous and coincides with $A$ on $D(A)$,
we have

$$Bx_0 = \lim_{n \to \infty} Bx_n = \lim_{n \to \infty} Ax_n = Ax_0,$$

which is what we had to prove. If the operator $A$ is defined throughout
space $X$, is distributive and bounded, we shall call it a linear operator
(it is occasionally called a bounded linear operator in this case). If
different $Ax$ of $R(A)$ correspond to different $x$ of the lineal $D(A)$, an
inverse operator $A^{-1}$ exists, defined on $R(A)$ and associating each $x'$ of
$R(A)$ with a unique element $x$ of $D(A)$, connected with $x'$ by the
relationship $x' = Ax$. Since $A$ is distributive, it follows at once that
$A^{-1}$ is distributive. But the fact that $A$ is bounded does not imply
that $A^{-1}$ is bounded.

Let us take as an example the operator $\varphi = Af$ on the segment [0, 1]
in space $C$:


$$\varphi(x) = \int_0^x f(t)\,dt, \tag{47}$$


where we assume that $X'$ is space $C$ itself. This is possible, since
$\varphi(x) \in C$ on [0, 1]. Operator (47) transforms the whole of $C$ into a
lineal, consisting of the functions $\varphi(x)$ having a continuous derivative
and vanishing at $x = 0$. There exists on this lineal of functions $\varphi(x)$
an inverse distributive operator $f(x) = \varphi'(x)$, but it is unbounded.
For, the functions $\varphi_n(x) = \sin n\pi x$ belonging to this lineal have norm 1
for any choice of the number $n$, whilst $\varphi'_n(x) = n\pi \cos n\pi x$ has norm
$n\pi$, which increases indefinitely as $n \to \infty$.
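
The growth of the ratio between the norm of the derivative and the norm of the function can be checked numerically. The sketch below is only an illustration: the maximum over [0, 1] is approximated on a uniform grid, and the helper names `sup_norm_on_grid`, `phi` and `phi_derivative` are hypothetical.

```python
import math

def sup_norm_on_grid(func, points=20001):
    """Approximate max |func(x)| over [0, 1] on a uniform grid."""
    return max(abs(func(i / (points - 1))) for i in range(points))

def phi(n):
    """phi_n(x) = sin(n*pi*x), an element of the range of operator (47)."""
    return lambda x: math.sin(n * math.pi * x)

def phi_derivative(n):
    """f_n(x) = phi_n'(x) = n*pi*cos(n*pi*x), the preimage of phi_n."""
    return lambda x: n * math.pi * math.cos(n * math.pi * x)

ratios = []
for n in (1, 5, 25, 125):
    norm_phi = sup_norm_on_grid(phi(n))           # close to 1
    norm_f = sup_norm_on_grid(phi_derivative(n))  # close to n*pi
    ratios.append(norm_f / norm_phi)
```

The entries of `ratios` grow like $n\pi$, so no single constant C can serve in an inequality of the form (44) for the inverse operator.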

THEOREM 2. If $B$ is a linear operator in $X$, $R(B) \subset X$ and
$\|B\| = a < 1$, the operator $(E - B)$, where $E$ is the operator of the identity
transformation (i.e. $Ex = x$), has an inverse $(E - B)^{-1}$, defined on the
whole of $X$, which is distributive and bounded.

We consider the equation


$$y = x - Bx \quad (48), \qquad \text{or} \qquad x = y + Bx, \quad (49)$$


where $y$ is given and $x$ is the required element of $X$. It is easily shown
that the operator $Ax = y + Bx$ ($A$ is the notation of [86]) satisfies
the condition for the principle of compressed mappings to be applicable.
For,


$$\|Ax_1 - Ax_2\| = \|B(x_1 - x_2)\| \le a \|x_1 - x_2\| \qquad (0 \le a < 1).$$


Thus, equation (48) or (49) has a unique solution for any $y \in X$, i.e.
there exists an inverse operator $x = (E - B)^{-1} y$, defined on the whole
of $X$. It is obviously distributive. Let us show that it is bounded.
We have

$$\|x - y\| = \|Bx\| \le a \|x\|$$

and all the more

$$\|x\| - \|y\| \le a \|x\|, \quad \text{i.e.} \quad \|x\| \le \frac{1}{1 - a} \|y\|,$$

whence it follows that the norm of $(E - B)^{-1}$ is not greater than
$1/(1 - a)$.
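
The successive approximations behind the principle of compressed mappings can be sketched numerically. The example assumes, purely for illustration, that X is the plane with the norm max(|x_1|, |x_2|) and that B is a 2 x 2 matrix whose largest absolute row sum (its operator norm for this choice of norm) is a = 0.5; the names `sup_norm`, `apply` and `solve_by_iteration` are hypothetical.

```python
def sup_norm(v):
    """Norm ||v|| = max |v_i| (an assumed choice of norm on the plane)."""
    return max(abs(c) for c in v)

def apply(b, v):
    """Apply the matrix b (list of rows) to the vector v."""
    return [sum(b[i][j] * v[j] for j in range(len(v))) for i in range(len(b))]

def solve_by_iteration(b, y, steps=200):
    """Iterate x <- y + Bx, i.e. equation (49), starting from x = y."""
    x = list(y)
    for _ in range(steps):
        bx = apply(b, x)
        x = [y[i] + bx[i] for i in range(len(y))]
    return x

B = [[0.2, -0.3], [0.1, 0.4]]
a = max(sum(abs(c) for c in row) for row in B)   # operator norm of B, below 1
y = [1.0, -2.0]
x = solve_by_iteration(B, y)

# Residual of equation (48): x - Bx - y should be essentially zero.
bx = apply(B, x)
residual = sup_norm([x[i] - bx[i] - y[i] for i in range(2)])
```

The computed x satisfies (48) up to rounding, and its norm stays within the bound $\|y\|/(1 - a)$ obtained in the proof.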

THEOREM 3. A linear operator maps a compact set into a compact
set. Let $U$ be a compact set of elements of $X$, $x_n$ ($n = 1, 2, \ldots$) be a
sequence of elements of $U$ and $A$ a linear operator. We have to show
that we can extract from the sequence $Ax_n$ ($n = 1, 2, \ldots$) a
subsequence convergent in $X'$. The compactness of $U$ implies the existence
of a subsequence $x_{n_k}$ having a limit: $x_{n_k} \Rightarrow x_0$ in $X$. Now, since $A$
is continuous, we have $Ax_{n_k} \Rightarrow Ax_0$ in $X'$, which is what we set out
to prove.

We shall state the following theorem without proof.

THEOREM 4. If a linear operator $A$ is defined in a (B) space $X$, mapping
$X$ one-to-one into the whole of space $X'$ (type B), the inverse
operator $A^{-1}$ (defined in the whole of $X'$) is also linear.


98. Linear functionals. Let us consider a real B space $X$ (the
elements of $X$ are only multiplied by real numbers). An operator, the
domain of values of which lies in real number space, is called a
functional in $X$. The real number space is the real B space with the usual
definition of addition of real numbers and of multiplication of them
by a real number. The norm is the absolute value of the real number
[87].

Everything said in [97] about operators also holds for functionals.
A functional is bounded if


$$|l(x)| \le \|l\| \, \|x\|, \tag{50}$$

where $|l(x)|$ is the absolute value of the real number $l(x)$ and $\|l\|$ is
the norm of $l(x)$. A linear functional is a particular case of a linear
operator. Here, $D(l)$ coincides with the whole of $X$.

THEOREM 1. If a distributive bounded functional $l(x)$ is given on a
lineal $U$, it can be extended to the whole of $X$ in such a way that $l(x)$
becomes a linear functional in $X$ with the same norm as in $U$.

By hypothesis, in addition to the fact that $l(x)$ is distributive, we
have

$$|l(x)| \le \|l\|_U \|x\| \qquad (x \in U), \tag{51}$$


where $\|l\|_U$ is the norm of $l(x)$ in $U$. We shall assume in the proof
that space $X$ is separable, which simplifies the arguments. The theorem
still holds, however, for non-separable spaces.

Since $X$ is separable, there exists a denumerable set of elements,
dense in $X$. We leave in this denumerable set only the elements that
do not belong to $U$. If there are no such elements, $U$ is everywhere dense
in $X$, and we can extend $l(x)$ in continuity on to the whole of $X$ [97].
Otherwise, the remaining elements of the denumerable set can also
be numbered: $x_1, x_2, x_3, \ldots$

We take the set $U_1$ of elements $z$ of the form: $z = y + t x_1$, where $y$
is any element of $U$ and $t$ is any real number. It is easily seen that $U_1$,
like $U$, is a lineal. Let us show that the above expression of $z$ is unique.
If $z$ has two different forms:


$$z = y + t x_1 = y' + t' x_1, \tag{52}$$


then $t \ne t'$ in these forms, since if $t = t'$, then $y = y'$. We show that $t \ne t'$
leads to an absurdity. We have from (52): $x_1 = (y' - y)/(t - t')$,
whence it follows that $x_1 \in U$, and this contradicts what has been
said. We now take any two elements $x'$ and $x''$ of $U$ and establish an
inequality. We have


$$l(x') - l(x'') = l(x' - x'') \le \|l\|_U \|x' - x''\|. \tag{53}$$


Notice that, if the left-hand side is a negative number, the
inequality is trivial. On observing that
$\|x' - x''\| = \|(x' + x_1) - (x'' + x_1)\| \le \|x' + x_1\| + \|x'' + x_1\|$, we get by (53):


$$l(x') - \|l\|_U \|x' + x_1\| \le l(x'') + \|l\|_U \|x'' + x_1\|.$$


On taking the strict lower bound of the set of numbers on the right,
and the strict upper bound for the left-hand side, when $x'$ and $x''$ run
independently over the whole of $U$, we get


$$\sup_{x \in U} \left[ l(x) - \|l\|_U \|x + x_1\| \right] \le \inf_{x \in U} \left[ l(x) + \|l\|_U \|x + x_1\| \right],$$


where the right-hand side is obviously finite, i.e. so is the left. Thus a
real number $a$ exists, satisfying

$$\sup_{x \in U} \left[ l(x) - \|l\|_U \|x + x_1\| \right] \le a \le \inf_{x \in U} \left[ l(x) + \|l\|_U \|x + x_1\| \right]. \tag{54}$$


We now extend $l(x)$ from $U$ onto $U_1$. Let $z = y + t x_1$ be any element
of $U_1$. We put

$$l(z) = l(y) - ta, \tag{55}$$


where $a$ is a fixed real number satisfying (54). If $z \in U$, then $t = 0$,
and $l(z)$ coincides with $l(y)$, i.e. (55) defines a functional on $U_1$
coinciding with the previous functional on $U$. We have therefore retained
the previous notation for the extended functional. The fact that $l(z)$
is distributive follows at once from (55), the distributiveness of $l(y)$
on $U$ and the formulae


$$z = y + t x_1; \qquad cz = cy + ct x_1;$$
$$z' = y' + t' x_1; \qquad z'' = y'' + t'' x_1; \qquad z' + z'' = (y' + y'') + (t' + t'') x_1.$$
We finally show that the norm of $l(z)$ in $U_1$ is not greater than $\|l\|_U$
(it cannot be lowered). We shall assume $t > 0$. On observing that
$l(y) = t\, l(y/t)$, where $y/t \in U$, we get


$$l(z) = t \left[ l\left(\tfrac{1}{t} y\right) - a \right]. \tag{56}$$

But it follows from (54) that

$$a \ge l\left(\tfrac{1}{t} y\right) - \|l\|_U \left\| \tfrac{1}{t} y + x_1 \right\|,$$

and on replacing $a$ in (56) by the smaller number on the right-hand
side of this inequality, we get ($t > 0$):


$$l(z) \le t \|l\|_U \left\| \tfrac{1}{t} y + x_1 \right\| = \|l\|_U \|y + t x_1\| = \|l\|_U \|z\|.$$

We turn to the case $t < 0$. It follows from (54) that

$$a \le l\left(\tfrac{1}{t} y\right) + \|l\|_U \left\| \tfrac{1}{t} y + x_1 \right\|$$

and, on replacing $a$ in the difference $l(y/t) - a$ of (56) by a greater
number, we get ($|t| = -t$):


$$l\left(\tfrac{1}{t} y\right) - a \ge -\|l\|_U \left\| \tfrac{1}{t} y + x_1 \right\| = -\frac{1}{|t|} \|l\|_U \|y + t x_1\| = -\frac{1}{|t|} \|l\|_U \|z\|.$$


On multiplying both sides of this inequality by the negative $t$ and
taking (56) into account, we get $l(z) \le \|l\|_U \|z\|$. If $t = 0$, then
$z \in U$, and the inequality is obvious. Thus we have $l(z) \le \|l\|_U \|z\|$
for any $z \in U_1$. On replacing $z$ in this inequality by $(-z)$ and noting
that $l(-z) = -l(z)$ and $\|-z\| = \|z\|$, we get $-l(z) \le \|l\|_U \|z\|$.
These two inequalities finally lead us to


$$|l(z)| \le \|l\|_U \|z\|, \tag{57}$$


from which it follows that the norm remains unchanged with our
extension of $l(x)$ from $U$ to $U_1$.

We now turn to a further extension. If an element $x_2$ of the
above-mentioned sequence belongs to $U_1$, we throw it away. If not, we
extend $l(x)$ as above from the lineal $U_1$ to the lineal $U_2$, consisting of
elements $z = y + t x_2$, where $y$ is any element of $U_1$ and $t$ is any real
number. We proceed in this way, whilst retaining the previous notation
$x_1, x_2, \ldots$ for elements that have not been thrown away in our
construction (there may be a finite number of them). We thus continue
$l(x)$ on to the lineal $V$ of elements having the form


$$y + c_1 x_1 + c_2 x_2 + \ldots + c_n x_n,$$


where $y$ is any element of $U$, $n$ is any positive integer (not greater
than the number of elements $x_k$, if this number is finite), and $c_k$ are
any real numbers. This lineal $V$ is dense in $X$, and the functional $l(x)$
is distributive and bounded on it, with the norm $\|l\|_U$. It now
remains to extend $l(x)$ on to the whole of $X$ in continuity. The
theorem is proved.

A proof of Theorem 1 for the case of a complex B space may be
found, for example, in G. A. Sukhomlinov's article (Matem. sb., 3, 1938), and in
the book Leçons d'Analyse Fonctionnelle by F. Riesz and B. Sz.-Nagy
(Budapest, 1953; translation published by Ungar, N. Y., 1955).

The theorem cannot be extended to operators. 

THEOREM 2. If $x_0$ is any fixed element of $X$, different from zero, a
linear functional $l(x)$ exists with unit norm ($\|l\| = 1$) such that
$l(x_0) = \|x_0\|$.

We take the lineal $U$ of elements of the form $x = t x_0$, where $t$ is any
real number, and we define a distributive functional $l(x)$ on $U$ by the
formula $l(t x_0) = t \|x_0\|$. When $t = 1$, we have $l(x_0) = \|x_0\|$, and
$\|l\|_U = 1$, since $|l(t x_0)| = |t| \, \|x_0\| = \|t x_0\|$. By Theorem 1, we can
extend $l(x)$ on to the whole of $X$ whilst
preserving the norm, and the theorem is proved. One consequence
is that functionals with positive norms exist in every B space.
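
In a concrete space the functional of Theorem 2 can often be written down directly. The sketch below assumes, as an illustration only, that X is three-dimensional space with the Euclidean norm, where $l(x) = (x \cdot x_0)/\|x_0\|$ has the required properties; the general theorem does not rely on any such formula, and the names `norm` and `make_functional` are hypothetical.

```python
import math
import random

def norm(v):
    """Euclidean norm (an assumed choice of norm for this illustration)."""
    return math.sqrt(sum(c * c for c in v))

def make_functional(x0):
    """Return l(x) = <x, x0> / ||x0||, so that l(x0) = ||x0||."""
    length = norm(x0)
    return lambda x: sum(a * b for a, b in zip(x, x0)) / length

x0 = [3.0, -4.0, 12.0]          # ||x0|| = 13
l = make_functional(x0)

# Sample random elements and record the largest ratio |l(x)| / ||x||.
rng = random.Random(0)
worst_ratio = 0.0
for _ in range(1000):
    x = [rng.uniform(-1.0, 1.0) for _ in range(3)]
    worst_ratio = max(worst_ratio, abs(l(x)) / norm(x))
```

Here `l(x0)` equals the norm of `x0`, while the sampled ratios never exceed 1, which is consistent with $\|l\| = 1$.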

Notice that the theorem also holds when $x_0 = \theta$ is the zero element.
We only need to take any element $x_1$, different from $\theta$, and form in
accordance with the theorem the linear functional $l(x)$ such that
$\|l\| = 1$ and $l(x_1) = \|x_1\|$. For this $l(x_0) = l(\theta) = 0 = \|x_0\|$.


99. Conjugate spaces. We consider the space $X^*$, the elements of
which are all possible linear functionals in the B space $X$. Functionals
in $X$, i.e. real numbers corresponding to an element $x$, are denoted by
$l(x), m(x), n(x), \ldots$ Regarded as elements of $X^*$, we denote them by
the single letter: $l, m, n, \ldots$ $X^*$ is a linear space. Addition and
multiplication by a number are introduced for functionals in the following
natural way:

$$(l + m)(x) = l(x) + m(x); \qquad (al)(x) = a\, l(x),$$

all the properties of axiom A being fulfilled. The zero element of $X^*$
is the annihilation functional, i.e. such that $l(x) = 0$ for any $x \in X$.
The norm of an element $l$ of $X^*$ is taken equal to the norm of the
corresponding functional. This norm is $\ge 0$, where the = sign only holds
for the annihilation functional. The two other properties of axiom
C hold.

The second follows from
$|l_1(x) + l_2(x)| \le |l_1(x)| + |l_2(x)| \le \|l_1\| \, \|x\| + \|l_2\| \, \|x\| = (\|l_1\| + \|l_2\|) \|x\|$,
whilst the third
is obvious. Let us show that $X^*$ is a complete space. Suppose we have
a mutually convergent sequence of elements of $X^*$:


$$\|l_n - l_m\| \to 0 \quad \text{as } n \text{ and } m \to \infty. \tag{58}$$


We have to show that there exists an element $l \in X^*$ such that
$\|l - l_n\| \to 0$ as $n \to \infty$. We write $l_n - l_m = l_{nm} \in X^*$. We have
$l_n = l_m + l_{nm}$ and $\|l_n\| \le \|l_m\| + \|l_{nm}\|$. By (58), there exists an
$N$ such that $\|l_{nm}\| \le 1$ for $n$ and $m > N$. Having fixed $m = m_0 > N$,
we get $\|l_n\| \le \|l_{m_0}\| + 1$ for $n > N$, whilst there is a greatest
among the finite number of non-negative numbers $\|l_n\|$ ($n = 1, 2,
\ldots, N$). Hence it follows from (58) that there exists a $C > 0$ such
that $\|l_n\| \le C$ for all subscripts $n$. We have further:
$|l_n(x) - l_m(x)| \le \|l_n - l_m\| \, \|x\|$, and it follows from (58) that
$|l_n(x) - l_m(x)| \to 0$ as $n$ and $m \to \infty$ with any choice of $x \in X$. By Cauchy's
test (for numbers), the number sequence $l_n(x)$ has a limit. This limit,
which we write as $l(x)$ ($l_n(x) \to l(x)$), is some functional defined in the
whole of $X$. Let us show that it is a linear functional. The equations
$l_n(x + y) = l_n(x) + l_n(y)$ and $l_n(ax) = a\, l_n(x)$ show that it is
distributive, and $\|l_n\| \le C$, i.e. $|l_n(x)| \le C \|x\|$ (by means of a passage to
the limit as $n \to \infty$) that it is bounded. Thus $l(x)$ is a linear functional
in $X$, i.e. $l \in X^*$. It remains to show that $\|l - l_n\| \to 0$. Given any
$\varepsilon > 0$, by (58), there exists an $N$ such that $|l_n(x) - l_m(x)| \le \varepsilon \|x\|$
for $n$ and $m > N$. On passing to the limit as $m \to \infty$, we get
$|l_n(x) - l(x)| \le \varepsilon \|x\|$ for $n > N$, i.e. $\|l - l_n\| \le \varepsilon$ for $n > N$, which
gives $\|l - l_n\| \to 0$. This shows that $X^*$ is complete. Thus $X^*$ is a B
space; it is called the conjugate to $X$.

We introduce the second conjugate space $X^{**} = (X^*)^*$, the elements
of which are all possible linear functionals in $X^*$. Space $X^{**}$ is obtained
from $X^*$ in the same way as $X^*$ from $X$, and $X^{**}$ is a B space.

If we fix $x \in X$, for any element $l \in X^*$ there will be a corresponding
definite real number $l(x)$, i.e. $l(x)$ is a functional in $X^*$ for $x$ fixed and
$l$ varying in $X^*$. Let us denote it by the symbol $L_x(l)$. It follows from
$(l_1 + l_2)(x) = l_1(x) + l_2(x)$ and $(a l_1)(x) = a\, l_1(x)$ that
$L_x(l_1 + l_2) = L_x(l_1) + L_x(l_2)$ and $L_x(al) = a L_x(l)$, i.e. $L_x(l)$ is a
distributive functional in $X^*$.

Since $L_x(l) = l(x)$, it follows from


$$|l(x)| \le \|l\| \, \|x\| \tag{59}$$
that
$$|L_x(l)| \le \|x\| \, \|l\|, \tag{60}$$


where $\|x\|$ is the norm of $x$ in $X$ and $\|l\|$ is the norm of $l$ in $X^*$. It
follows from (60) that the norm of the functional $L_x(l)$ in $X^*$ is not
greater than $\|x\|$, i.e. $L_x(l)$ is bounded in $X^*$. Thus $L_x(l)$ is a linear functional
in $X^*$. If $x = \theta$ is the zero element in $X$, then $L_\theta(l) = l(\theta) = 0$ for any
$l \in X^*$, i.e. $L_\theta(l)$ is the zero element in $X^{**}$. It follows from (60) that
the norm of $L_x(l)$ is not greater than $\|x\|$. But by Theorem 2 of [98],
given any $x \ne \theta$, there exists a functional $l(x)$ such that $l(x) = \|x\|$
and $\|l\| = 1$. For such an $l$, both sides of (60) are equal to $\|x\|$
and the = sign holds, whence it follows that the norm of $L_x(l)$ is equal
to $\|x\|$. Further, it follows from $l(x_1 + x_2) = l(x_1) + l(x_2)$ and
$l(c_1 x_1) = c_1 l(x_1)$ that $L_{x_1 + x_2}(l) = L_{x_1}(l) + L_{x_2}(l)$, $L_{c_1 x_1}(l) = c_1 L_{x_1}(l)$
and in general $L_{c_1 x_1 + c_2 x_2}(l) = c_1 L_{x_1}(l) + c_2 L_{x_2}(l)$. In particular,
$L_{x_1 - x_2}(l) = L_{x_1}(l) - L_{x_2}(l)$, whence it follows, in view of what has been
said about the norm, that $\|L_{x_1} - L_{x_2}\| = \|x_1 - x_2\|$, which shows
that distinct elements $L_x$ of space $X^{**}$ correspond to distinct $x$.

The following important proposition is a consequence of the above. 

THEOREM. We can associate with every element $x \in X$ an element
$L_x \in X^{**}$. In this correspondence, distinct $x$ correspond with distinct $L_x$,
addition and multiplication by a number in $X$ correspond to the same
operations for the corresponding elements in $X^{**}$, and the norms of
corresponding elements in $X$ and $X^{**}$ are the same.

This proposition enables us to identify $L_x$ with $x$, i.e. to embed $X$
in $X^{**}$, which is written as $X \subset X^{**}$. In other words, $X$ is isometric
with part of $X^{**}$.

We shall see later that $X$ is isometric with the whole of $X^{**}$ in
certain cases, i.e. $X^{**} = X$. Such a space $X$ is described as regular.
The notation $(l, x)$ or $(x, l)$ is often used instead of $l(x) = L_x(l)$:


$$(x, l) = (l, x) = l(x), \tag{61}$$

and $(x, l)$ is called the inner product of the element $l \in X^*$ and the
element $x \in X$. The elements $l$ and $x$ are said to be orthogonal if
$(x, l) = 0$.

We can write (59) as

$$|(x, l)| \le \|x\| \, \|l\|, \tag{62}$$

and it follows from what has been said that

$$(ax, bl) = ab\,(x, l); \qquad (x_1 + x_2, l) = (x_1, l) + (x_2, l); \qquad (x, l_1 + l_2) = (x, l_1) + (x, l_2),$$

where a and b are any real numbers. 

We have so far considered real B spaces. Everything said above 
still holds for complex B spaces (multiplication of an element x by 
arbitrary complex numbers). In this case functionals may take any 
complex values. In future, we shall write a as previously for the 
complex conjugate of a (a = a + bi;a = a — bi). Notice that, if U(x) 
is a linear functional in B, d(x) is also bounded in X, but is not a linear 
functional, since U(x) is multiplied by ¢ when z is multiplied by the 
complex number c. The conjugate space X* is defined as the set of all 
U(x), not. of all U(z). 

Addition of elements of X* and multiplication by a complex number 
are ordinary addition and multiplication by complex numbers. The 
norm || 7 || is taken equal to || J ||, and | U(x) | < |[2]| - || 2 ||; it is not 
possible to replace |{ 7 || by a smaller number. X* is a B space. The 
inner product is defined by the formulae 


— —s 


Zefa tlinls ap =—(6e)—1i2). (63) 
Here, for any complex a and b: 
(al, bx) = ab(l, x), (64) 


The space X** = (X*)* has the same connection with X as in the 
case of a real space. 


100. Weak convergence of functionals. We considered in [99] the
convergence of a sequence of linear functionals $l_n(x)$ to the functional
$l(x)$: $\|l - l_n\| \to 0$ and all the more, $l_n(x) \to l(x)$ for any $x \in X$. This
is usually described as a convergence in norm. It follows from [99]
that the norms of the $l_n(x)$ are bounded, i.e. do not exceed some
positive number for any $n$. Let us now introduce a new concept of
convergence. We say that a sequence of linear functionals $l_n(x)$ is weakly
convergent if, given any $x \in X$, the sequence $l_n(x)$ has a (finite) limit.
Let us write $l(x)$ for this limit. This is a functional defined in the whole
of $X$. It is distributive because $l_n(x)$ are distributive, and if we knew
that the sequence $\|l_n\|$ is bounded, we could say that $l(x)$ is bounded
(and therefore linear). This is in fact the case.

THEOREM 1. Let $L$ be a set of linear functionals $l(x)$, where there exists
for any element $x$ a positive number $m_x$ such that $|l(x)| \le m_x$ if $l \in L$, i.e.
given a fixed $x$, the set of numbers $|l(x)|$ is bounded. Then the norms of the
functionals $l(x)$ ($l \in L$) are bounded.

We show first that the theorem can be proved simply by showing
that $|l(x)|$ is bounded in some sphere. In fact, let there exist a positive
number $b$ such that


$$|l(x)| \le b \qquad (l \in L), \tag{65}$$


if $x$ belongs to some closed sphere $S(x_0, a)$ ($\|x - x_0\| \le a$), and let $y$
be any element of $X$ differing from zero. The element $x = a y/\|y\| + x_0$
belongs to $S(x_0, a)$, and by (65), we have


$$\left| \frac{a}{\|y\|}\, l(y) + l(x_0) \right| \le b$$

and all the more

$$\frac{a}{\|y\|}\, |l(y)| - |l(x_0)| \le b,$$

whence

$$|l(y)| \le \frac{b + |l(x_0)|}{a} \|y\| \le \frac{2b}{a} \|y\| \tag{66}$$


for any $y \in X$, i.e. $\|l\| \le 2b/a$, which in fact amounts to the assertion
of the theorem. Hence, if the set of numbers $|l(x)|$ ($l \in L$) is bounded in at
least one sphere, the theorem is proved. Let us now prove the theorem by
reductio ad absurdum. Suppose that the set in question is unbounded
in every closed sphere; we show that this leads to a contradiction.

We fix a closed sphere $S_0$. By the supposition, there exist an
element $x_1 \in S_0$ and a functional $l_1 \in L$ such that $|l_1(x_1)| > 1$. Since
$l_1(x)$ is continuous, we can assume that $x_1$ lies inside $S_0$ and that the
inequality $|l_1(x)| > 1$ is satisfied throughout $S(x_1, r_1)$, where $r_1$ is a
sufficiently small positive number. As above, there exist an element
$x_2$ lying inside $S(x_1, r_1)$, a functional $l_2 \in L$ and a small positive $r_2$
such that the sphere $S(x_2, r_2)$ belongs to $S(x_1, r_1)$ and $|l_2(x)| > 2$ in
the whole of this sphere. On proceeding in this way, we get a sequence
of embedded spheres


$$S(x_1, r_1) \supset S(x_2, r_2) \supset S(x_3, r_3) \supset \ldots$$


and of functionals $l_k \in L$ such that $|l_k(x)| > k$ in the whole of $S(x_k, r_k)$.
It can naturally be assumed here that $r_k \to 0$ as $k \to \infty$. Thus
$|l_k(x_0)| > k$ at the point $x_0$ belonging to all these spheres [85], which
contradicts the fact that the set of numbers $|l_k(x_0)|$ must be bounded. The
theorem is proved.

If a sequence of functionals $l_n(x)$ has a finite limit for any $x$, the
number sequence $|l_n(x)|$ is bounded for any $x$, and by the theorem,
the sequence $\|l_n\|$ is bounded. As we remarked above, it follows
from this that the limit $l(x)$ of a weakly convergent sequence of linear
functionals is also a linear functional.

If the sequence $\|l_n\|$ is bounded, the existence of a limit of $l_n(x)$
only on a lineal dense in $X$, and not on the whole of $X$, proves to be
sufficient for the weak convergence of the functionals.

THEOREM 2. The necessary and sufficient condition for a sequence of
functionals $l_n(x)$ to be weakly convergent is that the sequence $\|l_n\|$ be
bounded and that there exist a limit of $l_n(x)$ on a lineal dense in $X$.
The necessity of the first condition follows from Theorem 1, whilst
the second is obvious. We turn to the proof of sufficiency. Let $U$
denote the lineal mentioned in the second condition. By the first
condition, $\|l_n\| \le C$, where $C$ is a positive number. On writing $l(x)$
for the limit of $l_n(x)$ for $x \in U$, we can say that $l(x)$ is a distributive
functional bounded on $U$ (its norm does not exceed $C$). We can extend
it in continuity on to the whole of $X$. We again use $l(x)$ to denote the
linear functional thus obtained, and show that $l_n(x) \to l(x)$ for any
$x \in X$. If $x \in U$, this is true. Let $x_0 \notin U$. Given any $\varepsilon > 0$, there
exists an element $x'_0 \in U$ such that $\|x_0 - x'_0\| \le \varepsilon/4C$. On recalling
that the norms of $l_n(x)$ and $l(x)$ do not exceed $C$, we can write


$$|l(x_0) - l_n(x_0)| \le |l(x_0) - l(x'_0)| + |l(x'_0) - l_n(x'_0)| + |l_n(x'_0) - l_n(x_0)|$$
$$\le \|l\| \, \|x_0 - x'_0\| + |l(x'_0) - l_n(x'_0)| + \|l_n\| \, \|x'_0 - x_0\| \le \frac{\varepsilon}{2} + |l(x'_0) - l_n(x'_0)|.$$


But $l_n(x'_0) \to l(x'_0)$, so that there exists a subscript $N$ such that
$|l(x'_0) - l_n(x'_0)| \le \varepsilon/2$ for $n > N$, and from the previous inequality:
$|l(x_0) - l_n(x_0)| \le \varepsilon$ for $n > N$, whence it follows that $l_n(x_0) \to l(x_0)$.

Note. The second condition of the theorem can be replaced by
the following: there exists a limit of $l_n(x)$ on a set of elements $V$, the
linear envelope of which $U$ (a lineal) is dense in $X$. For, since $l_n(x)$ is
distributive, it follows from the convergence of $l_n(x)$ on $V$ that $l_n(x)$
is convergent on $U$.

The concept of the weak convergence of functionals leads naturally
to the concept of weak compactness. A set $W$ of elements of $X^*$ is said
to be weakly compact if we can extract a weakly convergent
subsequence from any sequence of functionals $l_n \in W$.

THEOREM 3. If $X$ is separable, any bounded set of functionals
($\|l\| \le r$ ($r > 0$)) is weakly compact.

We have to show that, if the norms are bounded for a sequence of
linear functionals $l_n(x)$: $\|l_n\| \le r$, we can choose a subsequence $l_{n_k}(x)$,
convergent for every $x \in X$. Let $x_1, x_2, \ldots$ be a denumerable set $V$ of
elements of $X$, dense in $X$. Given any $m$, we have $|l_n(x_m)| \le r \|x_m\|$,
i.e. the sequence of numbers $l_n(x_m)$ ($n = 1, 2, 3, \ldots$) is bounded. On
applying the usual diagonal process [IV; 15], we form a subsequence
$l_{n_k}(x)$, convergent on all the elements of $V$. It follows from the note
on the previous theorem that the sequence is convergent on the whole
of $X$, and the theorem is proved.

Note. Every bounded set of elements of $X^*$ is obviously also
weakly compact, since it can be included in some sphere $\|l\| \le r$.


101. The weak convergence of elements. We now introduce the
concept of the weak convergence of elements of the B space $X$. We
say that a sequence $x_n$ of elements of $X$ is weakly convergent to an
element $x_0$, and write $x_n \xrightarrow{w} x_0$, if $l(x_n) \to l(x_0)$ for any linear functional
$l(x)$. The element $x_0$ is called the weak limit of $x_n$. Let us show that a
sequence $\{x_n\}$ cannot be weakly convergent to more than one limit.
In fact, if $x_n \xrightarrow{w} x_0$ and $x_n \xrightarrow{w} y_0$, we have by definition: $l(x_n) \to l(x_0)$
and $l(x_n) \to l(y_0)$ for any $l \in X^*$, whence $l(y_0) = l(x_0)$ or $l(y_0 - x_0) = 0$
for any $l \in X^*$. But if $y_0 \ne x_0$, i.e. $y_0 - x_0$ is not the zero element,
there exists an element $l \in X^*$ such that $l(y_0 - x_0) = \|y_0 - x_0\| > 0$,
which contradicts what we said above, and our assertion is thus proved.
If $x_n \xrightarrow{w} x_0$, it is obvious that every subsequence $x_{n_k} \xrightarrow{w} x_0$. If $x_n \xrightarrow{w} x_0$,
$y_n \xrightarrow{w} y_0$, and $a_n \to a_0$, then $a_n x_n \xrightarrow{w} a_0 x_0$, $x_n + y_n \xrightarrow{w} x_0 + y_0$. This
follows at once from the distributive property of the functionals.

Convergence in norm, $\|x_n - x_0\| \to 0$, which we wrote above as
$x_n \Rightarrow x_0$, is sometimes called strong convergence. We have simply
called it convergence. In view of the continuity of a linear functional,
it follows from $x_n \Rightarrow x_0$ that $l(x_n) \to l(x_0)$ for any $l \in X^*$, i.e. weak
convergence follows from strong convergence.

Strong does not in general follow from weak convergence. Let us
give an example. We take the space $L_2$ on the segment [0, 1]. As the
sequence $x_n$ we take the functions $\sin n\pi t$ ($n = 1, 2, \ldots$).

As will be shown below [102], the general form of a functional in
$L_2[0, 1]$ is given by

$$l(x) = \int_0^1 f(t)\, x(t)\, dt,$$

where $f(t)$ is a fixed function of $L_2[0, 1]$ and $x(t)$ is any element of this
space. In particular, we have for the elements $x_n(t) = \sin n\pi t$:


$$l(x_n) = \int_0^1 f(t) \sin n\pi t \, dt,$$


whence it is clear that, discounting the factor $\sqrt{2}$, $l(x_n)$ are the Fourier
coefficients of $f(t)$ with respect to the system $\sin n\pi t$ on [0, 1]. We know
that $l(x_n) \to 0$ with any choice of $f(t)$ of $L_2[0, 1]$, i.e. $l(x_n) \to l(\theta)$,
where $\theta$ is the zero element of $L_2[0, 1]$ (the function equivalent to
zero). Thus $\sin n\pi t \xrightarrow{w} \theta$ as $n \to \infty$ in $L_2[0, 1]$. At the same time there
is no strong convergence, since


$$\|\theta - x_n\|^2 = \int_0^1 \sin^2 n\pi t \, dt = \frac{1}{2}.$$

THEOREM 1. If $x_n \xrightarrow{w} x_0$, the sequence $\|x_n\|$ is bounded.

We can regard $x_n$ and $x_0$ as elements of $X^{**}$. It now follows from
$x_n \xrightarrow{w} x_0$ that the functionals in $X^*$ corresponding to $x_n$ are weakly
convergent to the functional corresponding to $x_0$. But, by Theorem 1
of [100], the norms of these functionals, equal to $\|x_n\|$, form a
bounded set, and the theorem is proved.

The weak compactness of a set of elements of $X$ is defined like the
weak compactness of a set of functionals. Every bounded set of elements
$x$ ($\|x\| \le C$) is a bounded set in $X^{**}$, but this latter set is weakly
compact, as a set of functionals in $X^*$, when $X^*$ is separable. If $X$ is a
regular space, i.e. $X^{**} = X$, it follows from what has been said that

THEOREM 2. If $X$ is regular, whilst $X$ and $X^*$ are separable, every
bounded set of elements of $X$ is weakly compact.

Notice that, if $X$ is not a regular space, i.e. $X^{**}$ is wider than $X$,
the limit of a sequence of elements of $X^{**}$ can be an element of $X^{**}$
for which there is no corresponding element of $X$. It can be shown that,
if $X$ is separable and regular, $X^*$ is also separable.

An immediate consequence of Theorem 2 of [100] and the above
correspondence of elements of $X^{**}$ to elements $x \in X$ is:

THEOREM 3. The necessary and sufficient condition for a sequence $x_n$
of elements of a regular space $X$ to be weakly convergent is that the
sequence $\|x_n\|$ be bounded and that there exist a limit of $l(x_n)$ on some lineal
$U$ of elements $l \in X^*$, dense in $X^*$.

As in [100], the lineal U can be replaced by a set V of elements of X* 
whose linear envelope is dense in X*. 

THEOREM 4. Let $A$ be a linear operator in the B space $X$ and $R(A)$
belong to the (also B) space $X'$. If $x_n \xrightarrow{w} x_0$ in $X$, then $Ax_n \xrightarrow{w} Ax_0$ in $X'$.

Let $m(y)$ be any linear functional in $X'$. It is easily seen that $m(Ax)$
is a linear functional in $X$, and $x_n \xrightarrow{w} x_0$ in $X$ implies $m(Ax_n) \to m(Ax_0)$.
This holds for any functional in $X'$, so that $Ax_n \xrightarrow{w} Ax_0$ in $X'$, and the
theorem is proved. We know that a linear operator is continuous in
the sense of strong convergence. Theorem 4 asserts that it is continuous
also in the sense of weak convergence.

We have seen that, if $x_n \Rightarrow x_0$, then $\|x_n\| \to \|x_0\|$ [95]. This
property may not hold for weak convergence. To return to the above
example of the sequence of functions $\sin n\pi t$ of $L_2$ on $[0, 1]$, we have
seen that $\sin n\pi t \xrightarrow{w} \theta$, whilst $\|\sin n\pi t\| = 1/\sqrt{2}$.

THEOREM 5. If $x_n \xrightarrow{w} x_0$, then $\|x_0\| \le \overline{\lim}\, \|x_n\|$.

Notice first of all that $\overline{\lim}\, \|x_n\|$ is finite by Theorem 1. We use
reductio ad absurdum. Let $\|x_0\| > \overline{\lim}\, \|x_n\|$. We take a number $m$
satisfying the inequality

$$\|x_0\| > m > \overline{\lim}\, \|x_n\|. \tag{67}$$

It follows from this that an $N$ exists such that $\|x_n\| < m$ for $n > N$.
Further, there is a linear functional $l(x)$ such that $l(x_0) = \|x_0\|$ and
$\|l\| = 1$ [98], and we have $|l(x_n)| \le \|l\| \cdot \|x_n\|$, i.e.
$|l(x_n)| \le \|x_n\| < m$ for $n > N$, whilst, by (67), $l(x_0) = \|x_0\| > m$.

Thus $l(x_n)$ does not tend to $l(x_0)$, which contradicts the hypothesis,
and the theorem is proved.

Suppose that space $X$ satisfies the following condition: given any
$\delta > 0$, there exists a number $\eta < 1$ such that, if $\|x\| = \|y\| = 1$
and $\|x - y\| \ge \delta$, then $\left\|\tfrac{1}{2}(x + y)\right\| \le \eta$. We say here that $X$ is
a uniformly convex space. The following assertion holds for such
spaces: if $x_n \xrightarrow{w} x_0$ and $\|x_n\| \to \|x_0\|$, then $x_n \Rightarrow x_0$. We shall prove
this later for the particular case of Hilbert space. The property of
uniform convexity holds for spaces $L_p$ with $p > 1$ (see, e.g., S.L.
Sobolev, Some Applications of Functional Analysis to Mathematical
Physics (Nekotorye primeneniya funktsional'nogo analiza v
matematicheskoi fizike)).

THEOREM 6. If $x_n \xrightarrow{w} x_0$, then $x_0$ belongs to the closure (in norm) of
the linear envelope of the set of elements $x_n$ $(n = 1, 2, \ldots)$.

We use reductio ad absurdum. Let $U$ be the linear envelope of elements
$x_n$, and let $x_0$ not belong to $\overline{U}$, i.e.

$$\inf_{y \in U} \|x_0 - y\| = d > 0. \tag{68}$$

We have for any non-zero number $t$:

$$\|t x_0 + y\| \ge |t|\, d. \tag{69}$$

For,

$$\|t x_0 + y\| = |t| \cdot \left\|x_0 - \left(-\frac{1}{t}\right) y\right\|.$$

But if $y \in U$, then $(-1/t)\, y \in U$, and (69) follows at once from (68).
We now take a set of elements of the form

$$x = t x_0 + y, \tag{70}$$

where $y \in U$ and $t$ is any number. As in [98], it is easily seen that the
expression of $x$ in form (70) is unique, and that the set of elements $x$
in question is a lineal. We write it as $V$ and define a distributive
functional on it by the formula $l(x) = t$, so that $l(y) = 0$ if $y \in U$. We
show that this functional is bounded on $V$. Let $t \ne 0$. By (69), we can
write $|l(t x_0 + y)| = |t| \le (1/d)\, \|t x_0 + y\|$. This inequality is
obvious when $t = 0$.

Thus $\|l\| \le 1/d$ on $V$. We can extend $l$ on to the whole of $X$ with
the same bound for the norm and obtain a linear functional $l(x)$. By
definition, $l(x_0) = 1$ and $l(x_n) = 0$, since all $x_n \in U$. We see that $l(x_n)$
does not tend to $l(x_0)$, which contradicts the hypothesis $x_n \xrightarrow{w} x_0$.
The theorem is proved.

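The last theorem can also be stated as follows: if $x_n \xrightarrow{w} x_0$, there
exists a sequence of linear combinations of elements $x_n$:
$c_1^{(k)} x_1 + c_2^{(k)} x_2 + \ldots + c_{n_k}^{(k)} x_{n_k}$ $(k = 1, 2, \ldots)$,
which is strongly convergent to $x_0$:

$$c_1^{(k)} x_1 + c_2^{(k)} x_2 + \ldots + c_{n_k}^{(k)} x_{n_k} \Rightarrow x_0 \quad \text{as } k \to \infty.$$

The stronger assertion can be proved: if $x_n \xrightarrow{w} x_0$, there exists a
subsequence $x_{n_k}$ $(k = 1, 2, \ldots)$ such that

$$\frac{1}{k}\left(x_{n_1} + x_{n_2} + \ldots + x_{n_k}\right) \Rightarrow x_0 \quad \text{as } k \to \infty.$$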
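For the sequence $\sin n\pi t$ in $L_2[0, 1]$, which converges weakly but not strongly to the zero element, the arithmetic means can be examined numerically. The sketch below is illustrative only: it takes the whole sequence as the subsequence ($n_k = k$) and uses a midpoint rule with an assumed number of steps. Since the functions $\sin n\pi t$ are mutually orthogonal with squared norm $1/2$, the norm of the mean of the first $k$ of them should be $1/\sqrt{2k}$, which tends to zero.

```python
import math

def cesaro_mean_norm(k, steps=4000):
    """L_2[0, 1] norm of (1/k) * (sin(pi t) + ... + sin(k pi t)), midpoint rule."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        mean = sum(math.sin(n * math.pi * t) for n in range(1, k + 1)) / k
        total += mean * mean
    return math.sqrt(h * total)

for k in (1, 4, 16, 64):
    # Computed norm of the mean next to the value 1/sqrt(2k) from orthogonality.
    print(k, cesaro_mean_norm(k), 1.0 / math.sqrt(2 * k))
```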
THEOREM 7. If space $X$ is regular and the sequence $x_n \in X$ is weakly
mutually convergent, i.e. $l(x_n) - l(x_m) = l(x_n - x_m) \to 0$ as $n$ and $m \to \infty$
for any element $l \in X^*$, the sequence $x_n$ is weakly convergent.

It follows from the hypothesis and Cauchy's test for numerical
sequences that $l(x_n)$ has a limit for any $l \in X^*$, i.e. the linear functionals
$L_{x_n}(l) = l(x_n)$ in $X^*$ have a limit for any $l \in X^*$. This limit is also a
linear functional in $X^*$. But $X$ is regular, so that this limiting
functional has the form $L_{x_0}(l) = l(x_0)$, i.e. for any $l \in X^*$ we have
$l(x_n) \to l(x_0)$, i.e. $x_n \xrightarrow{w} x_0$, which is what we had to prove.

Theorem 7 can be stated alternatively as: a regular space X has weak 
completeness. 


102. Linear functionals in $C$, $L_p$ and $l_p$. 1. We know that the
general form of linear functional in $C$ on the finite interval $[a, b]$ is [15]:

$$l(f) = \int_a^b f(x)\,dg(x), \tag{71}$$

where $g(x)$ is a function of bounded variation, continuous from the
right and satisfying the condition $g(a) = 0$, where distinct functions
$g(x)$ with the indicated properties generate distinct functionals $l(f)$.
We also know that $\|l\| = \mathop{V}_a^b g(x)$. We can therefore associate each
$l(f)$ in $C$ with a function $g(x)$ with the indicated properties and with
the norm $\|g\| = \mathop{V}_a^b g(x)$ and identify space $C^*$ with the space $V$ of
these functions. Space $V$ is of type B [96].

We now consider functionals in $C^*$, i.e. in $V$. Such a functional is
given by (71) for any fixed function $f(x)$ continuous in $[a, b]$.

Thus space $C$ is embedded in $C^{**}$. Let us show that not every
functional in $V$ is expressible by (71). We take as the functional $l_0(g)$
in $V$ the sum of the jumps of $g(x)$ and show that it is not expressible
by (71) whatever the choice of continuous function $f(x)$. We form the
following element of $V$:

$$g_c(x) = \begin{cases} 0 & \text{for } a \le x < c, \\ 1 & \text{for } c \le x \le b \end{cases} \qquad (c > a).$$

If we were to have (71), we should get $l_0(g_c) = f(c)$. But the sum of
the jumps of $g_c(x)$ is unity, so that $f(c) = 1$, i.e. the continuous function
$f(x) \equiv 1$. But now (71) gives $l_0(g) = g(b) - g(a)$, and this difference
is not the sum of the jumps for every function $g(x) \in V$. Hence $C^{**}$
is wider than $C$, i.e. $C$ is not a regular space. Everything that has been
said refers to real functions, but can also be extended to complex
functions.

2. We now establish the general form of linear functional in space
$L_p$ $(p > 1)$ of real functions on the bounded measurable set $\mathcal{E}_0$ of $R_n$.
Let $l(f)$ be such a functional and $\omega_{\mathcal{E}}(x)$ the characteristic function of
any measurable set $\mathcal{E}$ belonging to $\mathcal{E}_0$. Obviously, $\omega_{\mathcal{E}}(x) \in L_p(\mathcal{E}_0)$,
and we can write

$$l(\omega_{\mathcal{E}}) = F(\mathcal{E}). \tag{72}$$


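Let us show that this set function, defined for all measurable $\mathcal{E}$ of
$\mathcal{E}_0$, is completely additive. Let $\mathcal{E} = \mathcal{E}_1 + \mathcal{E}_2 + \ldots$, where the
measurable sets $\mathcal{E}_k$ are pairwise disjoint. The series

$$\sum_{k=1}^{\infty} \omega_{\mathcal{E}_k}(x) \tag{73}$$

is convergent in $L_p(\mathcal{E}_0)$. For,

$$\left\| \sum_{k=q}^{r} \omega_{\mathcal{E}_k}(x) \right\|^p_{L_p(\mathcal{E}_0)} = \sum_{k=q}^{r} m(\mathcal{E}_k),$$

and the last sum tends to zero as $r$ and $q \to \infty$, since the measure is
completely additive. Series (73) is convergent to $\omega_{\mathcal{E}}(x)$ at each point
$x$ of $\mathcal{E}_0$.

Consequently it is also convergent to $\omega_{\mathcal{E}}(x)$ in $L_p(\mathcal{E}_0)$ [62] (or to
an equivalent function), i.e.

$$\omega_{\mathcal{E}}(x) = \sum_{k=1}^{\infty} \omega_{\mathcal{E}_k}(x), \tag{74}$$

where the convergence can be understood as a convergence in $L_p(\mathcal{E}_0)$.
In view of the continuity of the functional $l(f)$ in $L_p(\mathcal{E}_0)$, (74) gives us
$F(\mathcal{E}) = F(\mathcal{E}_1) + F(\mathcal{E}_2) + \ldots$, i.e. $F(\mathcal{E})$ is completely additive. If $\mathcal{E}'$
is a set of measure zero, then

$$|l[\omega_{\mathcal{E}'}(x)]| \le \|l\| \cdot \|\omega_{\mathcal{E}'}(x)\|_{L_p(\mathcal{E}_0)} = \|l\| \left[ \int_{\mathcal{E}'} dx \right]^{1/p} = 0,$$

i.e. $l[\omega_{\mathcal{E}'}(x)] = 0$ if $m(\mathcal{E}') = 0$, and the theorem of [73] gives the form
for $F(\mathcal{E})$:

$$F(\mathcal{E}) = \int_{\mathcal{E}} \psi(x)\,dx,$$

where we can say for the moment, as regards $\psi(x)$, that it is summable
on $\mathcal{E}_0$. We have thus shown that

$$l[\omega_{\mathcal{E}}(x)] = \int_{\mathcal{E}_0} \psi(x)\,\omega_{\mathcal{E}}(x)\,dx.$$

Since $l(f)$ is distributive, we have

$$l(\varphi) = \int_{\mathcal{E}_0} \psi(x)\,\varphi(x)\,dx \tag{75}$$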


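for any bounded function $\varphi(x)$ with a finite number of values. We shall
now show that the formula holds for any bounded $\varphi(x)$ measurable
on $\mathcal{E}_0$ (such a function belongs to $L_p(\mathcal{E}_0)$). We first suppose that
$\varphi(x) \ge 0$. By Theorem 1 of [46], there exists a sequence of functions
$\varphi_n(x)$ with a finite number of finite values such that $\varphi_n(x) \to \varphi(x)$
uniformly on $\mathcal{E}_0$. The $\varphi_n(x)$ are therefore bounded on $\mathcal{E}_0$ by the same
number. We have for the $\varphi_n(x)$:

$$l(\varphi_n) = \int_{\mathcal{E}_0} \psi(x)\,\varphi_n(x)\,dx. \tag{76}$$

It is obvious that $\varphi_n(x) \Rightarrow \varphi(x)$ in $L_p(\mathcal{E}_0)$, and we can pass to the
limit under the integral sign [54] in the last formula. The continuity
of the functional $l(\varphi)$ leads to (75), which is thus established for any
non-negative bounded $\varphi(x)$ measurable on $\mathcal{E}_0$. The case of a bounded
function of any sign reduces at once to the case discussed by writing
$\varphi(x) = \varphi^+(x) - \varphi^-(x)$, where $\varphi^+(x)$ and $\varphi^-(x)$ are the positive and
negative parts of $\varphi(x)$. We now show that $\psi(x) \in L_{p'}(\mathcal{E}_0)$, where
$(1/p) + (1/p') = 1$. On substituting in (75) the bounded measurable
function $\varphi(x)$, defined as follows:

$$\varphi(x) = \begin{cases} |\psi(x)|^{p'-1} \operatorname{sgn} \psi(x) & \text{for } |\psi(x)| \le N, \\ N^{p'-1} \operatorname{sgn} \psi(x) & \text{for } |\psi(x)| > N, \end{cases} \tag{77}$$

where

$$\operatorname{sgn} a = \begin{cases} 1 & \text{for } a > 0, \\ -1 & \text{for } a < 0, \\ 0 & \text{for } a = 0, \end{cases}$$

we get

$$l(\varphi) \ge \int_{\mathcal{E}_0} |\varphi(x)|^p\,dx, \tag{78}$$

since $|\psi(x)| \ge |\varphi(x)|^{1/(p'-1)}$ and $p'/(p' - 1) = p$.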
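On the other hand,

$$l(\varphi) \le \|l\| \cdot \|\varphi\| = \|l\| \left[ \int_{\mathcal{E}_0} |\varphi(x)|^p\,dx \right]^{1/p},$$

and we have by (78):

$$\int_{\mathcal{E}_0} |\varphi(x)|^p\,dx \le \|l\| \left[ \int_{\mathcal{E}_0} |\varphi(x)|^p\,dx \right]^{1/p},$$

whence

$$\left[ \int_{\mathcal{E}_0} |\varphi(x)|^p\,dx \right]^{1/p'} \le \|l\|.$$

But it follows from (77) that

$$|\varphi(x)|^p = \begin{cases} |\psi(x)|^{p'} & \text{for } |\psi(x)| \le N, \\ N^{p'} & \text{for } |\psi(x)| > N, \end{cases}$$

so that

$$\left[ \int_{\mathcal{E}_0} |\psi_N(x)|^{p'}\,dx \right]^{1/p'} \le \|l\|, \tag{79}$$

where $\psi_N(x)$ is the cut-off function $\psi(x)$. Hence it follows that
$\psi(x) \in L_{p'}(\mathcal{E}_0)$, and that

$$\|l\| \ge \|\psi\|_{L_{p'}(\mathcal{E}_0)}. \tag{80}$$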


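Now let $\varphi(x)$ be any function of $L_p(\mathcal{E}_0)$. There exists a sequence of
measurable bounded functions $\varphi_n(x)$, which tends to $\varphi(x)$ in $L_p(\mathcal{E}_0)$.
In view of the continuity of the functional, $l(\varphi_n) \to l(\varphi)$, and by virtue
of [62]:

$$\int_{\mathcal{E}_0} \psi(x)\,\varphi_n(x)\,dx \to \int_{\mathcal{E}_0} \psi(x)\,\varphi(x)\,dx.$$

We have (75) for $\varphi_n(x)$, and it follows from what has been said that
(75) holds for any $\varphi(x) \in L_p(\mathcal{E}_0)$. By Hölder's inequality:

$$|l(\varphi)| \le \left[ \int_{\mathcal{E}_0} |\psi(x)|^{p'}\,dx \right]^{1/p'} \|\varphi\|_{L_p(\mathcal{E}_0)},$$

i.e.

$$\|l\| \le \|\psi\|_{L_{p'}(\mathcal{E}_0)}, \tag{81}$$

which gives, in conjunction with (80):

$$\|l\| = \|\psi\|_{L_{p'}(\mathcal{E}_0)}. \tag{82}$$

Thus every linear functional in $L_p(\mathcal{E}_0)$ is expressible by (75), where
$\psi(x) \in L_{p'}(\mathcal{E}_0)$, and (82) holds.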

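Let $\psi(x)$ be any fixed function of $L_{p'}(\mathcal{E}_0)$. In view of Hölder's
inequality, (75) yields a linear functional in $L_p(\mathcal{E}_0)$, the norm of which
satisfies inequality (81), i.e. (75) is the general formula for a linear
functional in $L_p(\mathcal{E}_0)$. Since $p(p' - 1) = p'$, the function
$\varphi(x) = |\psi(x)|^{p'-1} \operatorname{sgn} \psi(x)$ belongs to $L_p(\mathcal{E}_0)$, and we can substitute this
$\varphi(x)$ in (75):

$$l\left[\,|\psi(x)|^{p'-1} \operatorname{sgn} \psi(x)\right] = \int_{\mathcal{E}_0} |\psi(x)|^{p'}\,dx. \tag{83}$$

The norm of $\varphi(x) = |\psi(x)|^{p'-1} \operatorname{sgn} \psi(x)$ in $L_p(\mathcal{E}_0)$ is equal to:

$$\left[ \int_{\mathcal{E}_0} |\varphi(x)|^p\,dx \right]^{1/p} = \left[ \int_{\mathcal{E}_0} |\psi(x)|^{p'}\,dx \right]^{1/p},$$

and we get from (83)

$$\int_{\mathcal{E}_0} |\psi(x)|^{p'}\,dx \le \|l\| \left[ \int_{\mathcal{E}_0} |\psi(x)|^{p'}\,dx \right]^{1/p},$$

whence

$$\|l\| \ge \|\psi\|_{L_{p'}(\mathcal{E}_0)},$$

which, in conjunction with (81), again yields (82). Thus formula (75),
where $\psi(x)$ is any function of $L_{p'}(\mathcal{E}_0)$, gives the general form of a linear
functional in $L_p(\mathcal{E}_0)$, where (82) holds.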

Equivalent functions $\psi(x)$ obviously yield the same functionals
(coincident on all $\varphi(x) \in L_p(\mathcal{E}_0)$). Let us show that non-equivalent
$\psi(x)$ yield distinct functionals. This obviously amounts to proving
the following: if $\psi_0(x) \in L_{p'}(\mathcal{E}_0)$ and we have for any $\varphi(x) \in L_p(\mathcal{E}_0)$:

$$\int_{\mathcal{E}_0} \psi_0(x)\,\varphi(x)\,dx = 0,$$

then $\psi_0(x)$ is equivalent to zero. On putting $\varphi(x) = |\psi_0(x)|^{p'-1} \operatorname{sgn} \psi_0(x)$,
we get

$$\int_{\mathcal{E}_0} |\psi_0(x)|^{p'}\,dx = 0,$$

whence it follows at once that $\psi_0(x)$ is equivalent to zero [51].

Let us now consider space $L_p(\mathcal{E}_\infty)$, where $\mathcal{E}_\infty$ is the entire space $R_n$.
As above, it may be shown that, if $\psi(x) \in L_{p'}(\mathcal{E}_\infty)$, (75) defines a linear
functional in $L_p(\mathcal{E}_\infty)$, and equation (82) holds. Let us show that every
functional in $L_p(\mathcal{E}_\infty)$ is expressible in the form (75), where
$\psi(x) \in L_{p'}(\mathcal{E}_\infty)$. We take the functions $\varphi(x)$ of $L_p(\mathcal{E}_\infty)$ which vanish outside
the interval $\Delta_m$ $(-m \le x_k \le +m;\ k = 1, 2, \ldots, n)$. They form the
space $L_p(\Delta_m)$. The functional $l(\varphi)$ for such functions on $L_p(\mathcal{E}_\infty)$ is
also a functional on $L_p(\Delta_m)$, and its general form is

$$l(\varphi) = \int_{\Delta_m} \psi_m(x)\,\varphi(x)\,dx,$$

where $\psi_m(x) \in L_{p'}(\Delta_m)$ and $\|\psi_m(x)\|_{L_{p'}(\Delta_m)} \le \|l\|$. It follows at
once from what has been said that $\psi_{m+k}(x)$ and $\psi_m(x)$ are equivalent
on $\Delta_m$ for $k > 0$. We thus obtain the function $\psi(x) \in L_{p'}(\mathcal{E}_\infty)$,
equivalent to $\psi_m(x)$ on $\Delta_m$, and we have

$$l(\varphi) = \int_{\mathcal{E}_\infty} \psi(x)\,\varphi(x)\,dx.$$

Since finite functions are everywhere dense in $L_p(\mathcal{E}_\infty)$, we can conclude
that everything said above for $L_p(\mathcal{E}_0)$ also holds for $L_p(\mathcal{E}_\infty)$. The
results obtained are readily extended to the complex space $L_p(\mathcal{E}_0)$, the
functionals being also capable of taking complex values. It follows at
once from what has been said that the space $L_p^*(\mathcal{E}_0)$ can be identified
with $L_{p'}(\mathcal{E}_0)$, and hence $L_p^{**}(\mathcal{E}_0)$ with $L_{p'}^*(\mathcal{E}_0)$, i.e. $L_p^{**}(\mathcal{E}_0)$ coincides with
$L_p(\mathcal{E}_0)$. In other words, given a fixed $\varphi(x) \in L_p(\mathcal{E}_0)$, the right-hand side
of (75) gives the general form for a linear functional in $L_{p'}(\mathcal{E}_0)$, with
norm equal to $\|\varphi\|_{L_p(\mathcal{E}_0)}$. Thus $L_p(\mathcal{E}_0)$ is a regular space. Since
$L_p^*(\mathcal{E}_0) = L_{p'}(\mathcal{E}_0)$, and $L_{p'}(\mathcal{E}_0)$ is separable, it can be asserted that every sphere
in $L_p(\mathcal{E}_0)$ (or every bounded set) is weakly compact. When $p = 2$ we
have $p' = 2$; i.e. $L_2^*(\mathcal{E}_0)$ is $L_2(\mathcal{E}_0)$. We shall consider this case in detail
in a subsequent chapter. Everything said above also holds for $L_p(\mathcal{E}_\infty)$.
On using the above formula for a linear functional in $L_p(\mathcal{E}_0)$ $(p > 1)$,
we can prove the following theorem:

THEOREM. If $\psi(x)$ is measurable on a bounded measurable set $\mathcal{E}_0$ and
$\psi(x)\varphi(x)$ is summable on $\mathcal{E}_0$ for any $\varphi(x) \in L_p(\mathcal{E}_0)$ $(p > 1)$, then
$\psi(x) \in L_{p'}(\mathcal{E}_0)$.

It follows at once from the hypotheses that $\psi(x)$ can take an infinite
value only on a set of measure zero, and we can assume that $\psi(x)$ only
takes finite values. We define the function sequence:

$$\psi_n(x) = \begin{cases} \psi(x) & \text{if } |\psi(x)| \le n, \\ n & \text{if } |\psi(x)| > n, \end{cases} \tag{84}$$

which tends to $\psi(x)$ at every point $x$. If $\varphi(x)$ is any function of $L_p(\mathcal{E}_0)$
then $|\psi_n(x)\varphi(x)| \le |\psi(x)\varphi(x)|$, where $\psi(x)\varphi(x)$ is summable on $\mathcal{E}_0$
by hypothesis. Hence it follows that

$$\lim_{n \to \infty} \int_{\mathcal{E}_0} \psi_n(x)\,\varphi(x)\,dx = \int_{\mathcal{E}_0} \psi(x)\,\varphi(x)\,dx.$$

But the $\psi_n(x)$ are bounded functions, i.e. belong to $L_{p'}(\mathcal{E}_0)$, and the
integrals on the left-hand side are linear functionals of $\varphi(x)$ in $L_p(\mathcal{E}_0)$.
They have a limit on any element $\varphi(x) \in L_p(\mathcal{E}_0)$, so that their norms
are bounded by some number $A$ [100]:

$$\int_{\mathcal{E}_0} |\psi_n(x)|^{p'}\,dx \le A^{p'},$$

whence we get in the limit [54]:

$$\int_{\mathcal{E}_0} |\psi(x)|^{p'}\,dx \le A^{p'},$$

which is what we wanted to prove.

The case $p = 1$ is singular. It can be shown that the space $L_1^*$ is
isometric with $M$ (the space of measurable bounded functions) and
$L_1$ is not a regular space.

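3. Let us now consider linear functionals in $l_p$ $(p > 1)$. Let $l(x)$ be
such a functional. Given any element $x(\xi_1, \xi_2, \ldots)$ of space $l_p$, there is
a corresponding cut-off element $x_n(\xi_1, \xi_2, \ldots, \xi_n, 0, 0, \ldots)$, and
$x_n \Rightarrow x$, since the series with the general term $|\xi_k|^p$ is convergent.
We introduce the elements $y_k(\xi_1^{(k)}, \xi_2^{(k)}, \ldots)$ $(k = 1, 2, \ldots)$ such that
$\xi_i^{(k)} = 0$ for $i \ne k$ and $\xi_k^{(k)} = 1$. Let $l(y_k) = a_k$. Since $l(x)$ is distributive,
we have $l(x_n) = a_1 \xi_1 + a_2 \xi_2 + \ldots + a_n \xi_n$, and we obtain, on
making use of the continuity of $l(x)$:

$$l(x) = a_1 \xi_1 + a_2 \xi_2 + \ldots \tag{85}$$

Let us consider the numbers $a_k$. We introduce the elements
$z_N(\eta_1^{(N)}, \eta_2^{(N)}, \ldots)$ of $l_p$ as follows:

$$\eta_k^{(N)} = \begin{cases} |a_k|^{p'-1} \operatorname{sgn} a_k & \text{for } k \le N, \\ 0 & \text{for } k > N. \end{cases}$$

We have

$$l(z_N) = \sum_{k=1}^{N} |a_k|^{p'}$$

and

$$\sum_{k=1}^{N} |a_k|^{p'} = l(z_N) \le \|l\| \cdot \|z_N\| = \|l\| \left[ \sum_{k=1}^{N} |a_k|^{p'} \right]^{1/p},$$

whence

$$\left[ \sum_{k=1}^{N} |a_k|^{p'} \right]^{1/p'} \le \|l\|,$$

and in the limit as $N \to \infty$:

$$\left[ \sum_{k=1}^{\infty} |a_k|^{p'} \right]^{1/p'} \le \|l\|,$$

i.e. $v(a_1, a_2, \ldots) \in l_{p'}$, and

$$\|v\|_{l_{p'}} \le \|l\|. \tag{86}$$

Further, Hölder's inequality, applied to the sum (85), shows that
$\|l\| \le \|v\|_{l_{p'}}$, and we obtain, by (86):

$$\|l\| = \|v\|_{l_{p'}}. \tag{87}$$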
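The role of the elements $z_N$ can be illustrated on a finite coefficient vector. The following sketch is only an illustration under stated assumptions: the coefficients `a` and the exponent `p` are hypothetical, and the helper names are not from the text. It evaluates the functional on the element with components $|a_k|^{p'-1} \operatorname{sgn} a_k$ and compares the ratio $l(z)/\|z\|_{l_p}$ with $\|v\|_{l_{p'}}$.

```python
def p_norm(v, p):
    """Norm of a finite vector in l_p."""
    return sum(abs(c) ** p for c in v) ** (1.0 / p)

def sgn(a):
    return (a > 0) - (a < 0)

def extremal_element(a, p):
    """Components |a_k|^(p'-1) * sgn(a_k), where 1/p + 1/p' = 1."""
    q = p / (p - 1.0)  # conjugate exponent p'
    return [abs(c) ** (q - 1.0) * sgn(c) for c in a]

def apply_functional(a, x):
    """l(x) = a_1*xi_1 + a_2*xi_2 + ... for finite vectors."""
    return sum(c * xi for c, xi in zip(a, x))

a = [3.0, -1.0, 0.0, 2.0]  # hypothetical coefficients a_k
p = 3.0                    # hypothetical exponent, p > 1
q = p / (p - 1.0)
z = extremal_element(a, p)
ratio = apply_functional(a, z) / p_norm(z, p)
# The two printed numbers should agree: the bound given by Hoelder's inequality is attained.
print(ratio, p_norm(a, q))
```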
It can be shown, precisely as for $L_p$, that (85), where $v(a_1, a_2, \ldots)$
is any element of $l_{p'}$, gives the general form of a linear functional in $l_p$,
where the element $v$ is uniquely defined by the functional $l(x)$ and (87)
holds. Hence it follows that $l_p^*$ is $l_{p'}$, and that $l_p$ is a regular space.
A theorem holds, precisely analogous to the theorem proved above:
if the series

$$\sum_{k=1}^{\infty} b_k a_k,$$

where $b_k$ are fixed, is convergent for any choice of $(a_1, a_2, \ldots) \in l_p$
$(p > 1)$, then $(b_1, b_2, \ldots) \in l_{p'}$.


103. Weak convergence in $C$, $L_p$ and $l_p$. 1. The weak convergence
of elements $f_n(x) \in C$ to the element $f(x) \in C$ (in a finite interval
$[a, b]$) is defined by

$$\lim_{n \to \infty} \int_a^b f_n(x)\,dg(x) = \int_a^b f(x)\,dg(x) \tag{88}$$

for any function $g(x)$ of bounded variation. The necessary and sufficient
conditions for this convergence are: (a) there exists a $C > 0$ such that
$|f_n(x)| \le C$ $(n = 1, 2, \ldots)$; (b) $f_n(x) \to f(x)$ for any $x \in [a, b]$.

Condition (a) follows directly from [101]. Further, if $x = x_0$ is any
fixed value of $[a, b]$ and $f(x)$ is any element of $C$, $l_0(f) = f(x_0)$ is
evidently a linear functional in $C$, and, since $f_n(x) \xrightarrow{w} f(x)$, we must have
$l_0(f_n) \to l_0(f)$, i.e. $f_n(x_0) \to f(x_0)$. We now have to show that (88) follows
from (a) and (b) for any choice of function $g(x)$ of bounded variation.
Since $|f_n(x)| \le C$ and $f_n(x) \to f(x)$, passage to the limit is permissible
in (88) if we regard the integrals as Lebesgue-Stieltjes integrals [54].
But, since $f(x)$ and $f_n(x)$ are continuous, these integrals can also be
regarded as ordinary Stieltjes integrals.

Notice further that, by the theorem of [101], when conditions (a)
and (b) are observed, there exists a sequence of linear combinations
of $f_n(x)$ which tends to $f(x)$ uniformly in $[a, b]$. The functions $f_n(x)$
and $f(x)$ are assumed continuous, as indicated above.

Notice that the non-regular space $C$ is not weakly complete. This
corresponds to the fact that the limit of the sequence $f_n(x)$ of continuous
functions convergent at every point of $[a, b]$ and jointly bounded
($|f_n(x)| \le m$) may not in fact be a continuous function.
2. The weak convergence of elements $\varphi_n(x) \in L_p(\mathcal{E}_0)$ $(p > 1)$ to an
element $\varphi(x) \in L_p(\mathcal{E}_0)$ is defined by

$$\lim_{n \to \infty} \int_{\mathcal{E}_0} \psi(x)\,\varphi_n(x)\,dx = \int_{\mathcal{E}_0} \psi(x)\,\varphi(x)\,dx \tag{89}$$

for any function $\psi(x) \in L_{p'}(\mathcal{E}_0)$. By Theorem 3 of [101], the necessary
and sufficient conditions for weak convergence in $L_p(\mathcal{E}_0)$ can be
stated as: (a) the norms of the $\varphi_n(x)$ are bounded, i.e.

$$\left[ \int_{\mathcal{E}_0} |\varphi_n(x)|^p\,dx \right]^{1/p} \le C, \tag{90}$$

and (b) (89) holds on the set of elements $\psi(x)$ of $L_{p'}(\mathcal{E}_0)$, the linear
envelope of which is everywhere dense in $L_{p'}(\mathcal{E}_0)$. When (90) is
fulfilled, it is sufficient that (89) hold for all the characteristic
functions $\omega_{\mathcal{E}}(x)$ of measurable sets $\mathcal{E}$ appearing in $\mathcal{E}_0$ ($\mathcal{E}_0$ is a bounded set).
When $\mathcal{E}_0$ is a one-dimensional finite or infinite interval, it is sufficient
that condition (90) and the equations

$$\lim_{n \to \infty} \int_c^{\xi} \varphi_n(x)\,dx = \int_c^{\xi} \varphi(x)\,dx \tag{91}$$

be satisfied, where $c$ is any fixed number of the interval and $\xi$ is an
arbitrary point of the interval.

3. The weak convergence of elements $(\xi_1^{(n)}, \xi_2^{(n)}, \ldots) \in l_p$ $(p > 1)$
to an element $(\xi_1, \xi_2, \ldots) \in l_p$ is defined by

$$\lim_{n \to \infty} \left(b_1 \xi_1^{(n)} + b_2 \xi_2^{(n)} + \ldots\right) = b_1 \xi_1 + b_2 \xi_2 + \ldots \tag{92}$$

for any element $(b_1, b_2, \ldots) \in l_{p'}$. The necessary and sufficient
conditions for weak convergence are

$$\sum_{k=1}^{\infty} |\xi_k^{(n)}|^p \le C^p, \tag{93}$$

$$\xi_k^{(n)} \to \xi_k \quad (k = 1, 2, \ldots). \tag{94}$$

Condition (93) is the usual one, whilst the necessity of (94) follows
from (92) if we take $b_i = 0$ for $i \ne k$ and $b_k = 1$. Let (93) and (94) be
satisfied. It follows from (94) that (92) is satisfied on elements of the
form $(0, 0, \ldots, 0, 1, 0, 0, \ldots)$ (the base vectors of space $l_{p'}$). But the
linear envelope of the base vectors is dense in $l_{p'}$, since the cut-off
elements, all the components of which vanish as from a certain number
(proper to each element), are dense in $l_{p'}$ [59]. Thus the sufficiency of
(93) and (94) follows from what was said in [101].
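Conditions (93) and (94) can be seen at work on the base vectors of $l_2$, which have norm 1 and whose components each tend to zero. The sketch below is illustrative only: the space is truncated to an assumed finite dimension `dim`, and `b` is a hypothetical element of $l_2$ with components $1/k$. The pairing in (92) tends to zero although the norms do not, so the sequence converges weakly, but not strongly, to the zero element.

```python
def unit_vector(n, dim):
    """Base vector with 1 in position n (1-based) and 0 elsewhere."""
    return [1.0 if k == n else 0.0 for k in range(1, dim + 1)]

def pairing(b, x):
    """b_1*xi_1 + b_2*xi_2 + ... for finite vectors."""
    return sum(bk * xk for bk, xk in zip(b, x))

dim = 2000                                   # assumed truncation of the sequence space
b = [1.0 / k for k in range(1, dim + 1)]     # hypothetical element of l_2 (truncated)

for n in (1, 10, 100, 1000):
    x = unit_vector(n, dim)
    norm = sum(c * c for c in x) ** 0.5
    # The pairing equals b_n = 1/n and shrinks; the norm stays equal to 1.
    print(n, pairing(b, x), norm)
```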
104. The space of linear operators and the convergence of sequences
of operators. We have discussed above the space of linear functionals
and types of convergence (weak and in the norm) of sequences of
functionals. Let us turn to the same problems for linear operators
in a B space $X$. Let $Y$ be the space of all possible linear operators
in $X$ with a domain of values in some B space $X'$. Addition and
multiplication by a number are defined as for functionals:

$$(A + B)x = Ax + Bx; \qquad (cA)x = c\,(Ax). \tag{95}$$

The norm of an element $A \in Y$ is the norm $\|A\|$ of the corresponding
operator. As in [99], $Y$ can be shown to be a B space.

We now consider a sequence of linear operators $A_n$ $(n = 1, 2, \ldots)$
from $X$ into $X'$. By what has been said, if $\|A_n - A_m\| \to 0$ as $n$ and
$m \to \infty$, there exists a linear operator $A$ such that $\|A - A_n\| \to 0$, so
that we have for any $x \in X$: $A_n x \Rightarrow Ax$ in $X'$. The convergence
$\|A - A_n\| \to 0$ is called a convergence of operators in the norm.
Here, $\|A_n\|$ $(n = 1, 2, \ldots)$ are bounded, as follows from:
$\|A_n\| \le \|A\| + \|A - A_n\|$.

Notice that mutual convergence in norm is necessary and sufficient
for the convergence in norm $\|A - A_n\| \to 0$, i.e. $\|A_n - A_m\| \to 0$
as $n$ and $m \to \infty$ (the space of operators is complete).

Let us write the obvious inequality

$$\|Ax - A_n x\| \le \|A - A_n\| \cdot \|x\|. \tag{96}$$

If $x$ belongs to some bounded set $U$ of space $X$, there exists a $d$ such
that $\|x\| \le d$ if $x \in U$, and (96) gives $\|Ax - A_n x\| \le \|A - A_n\|\, d$.
Hence it follows that, given any $\varepsilon > 0$, there exists a subscript $N$
(depending on $\varepsilon$ and not on $x$) such that $\|Ax - A_n x\| \le \varepsilon$ for $n > N$
and $x \in U$, i.e. the convergence of $A_n x$ to $Ax$ in any bounded set $U$ is
uniform. We therefore sometimes speak of a uniform convergence of
operators, instead of a convergence of operators in norm. Let us
consider another convergence of operators. We say that a sequence of
linear operators $A_n$ is strongly convergent to a linear operator $A$ if
$A_n x \Rightarrow Ax$ in $X'$ for any $x \in X$.

It may be shown as in [99] that, if $A_n x$ is a sequence convergent
in $X'$ for any $x \in X$, the sequence of norms $\|A_n\|$ is bounded, as also
that the following holds: the sufficient condition for the strong
convergence of a sequence $A_n$ is that the $\|A_n\|$ be bounded ($\|A_n\| \le C$)
and that $A_n x$ be convergent on a lineal dense in $X$. Suppose that $A_n x$
is convergent in $X'$ for any $x \in X$. Let $Ax$ denote the limit of
$A_n x$ ($A_n x \Rightarrow Ax$ in $X'$). The operator $A$ is distributive in $X$, because $A_n$
is distributive, and it is bounded because the sequence $\|A_n\|$ is
bounded, i.e. $A$ is a linear operator. Therefore, if $A_n x$ is convergent
in $X'$ for any $x \in X$, the sequence $A_n$ is strongly convergent to a
linear operator $A$. Since $X'$ is complete, it is sufficient to require the
mutual instead of the simple convergence of the sequence $A_n x$. Thus
the space of linear operators is complete not only with respect to
convergence in norm, but also with respect to strong convergence. As
already remarked above, convergence in norm implies strong convergence.

There is a third convergence of operators. We say that a sequence
of linear operators $A_n$ is weakly convergent to a linear operator $A$ if
$A_n x \xrightarrow{w} Ax$ in $X'$ for any $x \in X$. Obviously, weak convergence follows
from the strong convergence of operators. Strong and weak
convergence are the same for functionals.

We defined above the addition of linear operators and their
multiplication by a number. If $A$ is a linear operator from $X$ into $X'$ and
$B$ is a linear operator from $X'$ into $X''$, the operator $BA$, defined by

$$(BA)x = B(Ax),$$

is a linear operator from $X$ into $X''$. It is distributive because $A$ and $B$
are distributive, and bounded because

$$\|(BA)x\|_{X''} \le \|B\| \cdot \|Ax\|_{X'} \le \|B\| \cdot \|A\| \cdot \|x\|_X.$$

Hence it follows that $\|BA\| \le \|B\| \cdot \|A\|$. We can also form
the product of several factors. If $A$ is a linear operator from $X$ into $X$,
we can take a positive integral power of it: $A^2 x = A(Ax)$ and so on.

Notice that a product may depend on the order of the factors. If,
say, $A$ and $B$ are linear operators from $X$ into $X$, it is meaningful to
speak of the following linear operators from $X$ into $X$: $(BA)x = B(Ax)$
and $(AB)x = A(Bx)$. These operators may be different. A similar
remark applies to several factors.
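Both remarks, the bound for the norm of a product and its dependence on the order of the factors, can be illustrated in the finite-dimensional case. The sketch below is an illustration only: the matrices `A` and `B` are hypothetical, and the space is taken to be the plane with the norm $\max(|\xi_1|, |\xi_2|)$, for which the norm of an operator is the largest absolute row sum of its matrix.

```python
def matmul(B, A):
    """Matrix of the product BA, i.e. of the operator x -> B(Ax)."""
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))] for i in range(len(B))]

def op_norm_max(M):
    """Operator norm induced by the max-norm: the largest absolute row sum."""
    return max(sum(abs(c) for c in row) for row in M)

A = [[1.0, 2.0], [0.0, 1.0]]   # hypothetical operator in the plane
B = [[1.0, 0.0], [3.0, 1.0]]   # hypothetical operator in the plane

BA = matmul(B, A)
AB = matmul(A, B)
print(BA, AB)                                         # the two products differ
print(op_norm_max(BA), op_norm_max(B) * op_norm_max(A))
print(op_norm_max(AB), op_norm_max(A) * op_norm_max(B))
```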

A strong convergence of operators is sometimes simply called
convergence. We shall use the notation for it: $A_n \to A$. Let $A_n$ and $B_n$
be sequences of linear operators from $X$ into $X'$ and $a_n$ be a number
sequence. It is easily shown that, if $a_n \to a$, $A_n \to A$ and $B_n \to B$,
then $a_n A_n \to aA$ and $A_n + B_n \to A + B$. A similar assertion holds for
convergence in norm. If $A_n$ are linear operators from $X$ into $X'$ and
$B_n$ from $X'$ into $X''$, the fact that $A_n \to A$ and $B_n \to B$ implies
$B_n A_n \to BA$ (the same for convergence in norm). Let us prove the
last assertion.

We have

$$BAx - B_n A_n x = (B - B_n)(Ax) + B_n (A - A_n) x$$

and

$$\|BAx - B_n A_n x\|_{X''} \le \|(B - B_n)(Ax)\|_{X''} + \|B_n\| \cdot \|(A - A_n)x\|_{X'}.$$

The first term tends to zero, since $B_n \to B$, and the second because
the $\|B_n\|$ are bounded and $A_n \to A$.


105. Conjugate operators. Let $A$ be a linear operator from $X$ into
$X'$ ($X$ and $X'$ are B spaces), and $l'(x')$ be a functional in $X'$ ($l' \in {X'}^*$).
It is easily seen that $l'(Ax)$ is now a linear functional in $X$:

$$l'(Ax) = l(x).$$

Given $A$, this equation amounts to a correspondence of an element
$l \in X^*$ to every element $l' \in {X'}^*$. We can write this as $l = A^* l'$,
where the operator $A^*$, defined in the whole of ${X'}^*$ with a range of
values in $X^*$, is called the conjugate to $A$. Linear functionals and the
operator $A$ are distributive, so that $A^*$ is distributive. We now show
that $A^*$ is bounded, and that $\|A^*\| = \|A\|$, the left-hand side being
the norm of the operator in $X^*$ and the right-hand side in $X$. We have:

$$|l(x)| = |l'(Ax)| \le \|l'\| \cdot \|Ax\| \le \|l'\| \cdot \|A\| \cdot \|x\|,$$

whence $\|l\| \le \|l'\| \cdot \|A\|$. But $l = A^* l'$, so that $\|A^*\| \le \|A\|$.
Further, let $x_0$ be any fixed element of $X$ and $l'$ an element of ${X'}^*$
such that $\|l'\| = 1$ and $l'(Ax_0) = \|Ax_0\|$. We get

$$\|Ax_0\| = l'(Ax_0) = l(x_0) \le \|l\| \cdot \|x_0\| = \|A^* l'\| \cdot \|x_0\| \le \|A^*\| \cdot \|l'\| \cdot \|x_0\| = \|A^*\| \cdot \|x_0\|,$$

i.e. $\|Ax_0\| \le \|A^*\| \cdot \|x_0\|$, whence $\|A\| \le \|A^*\|$, which, in
conjunction with $\|A^*\| \le \|A\|$, gives $\|A^*\| = \|A\|$. This leads
us to the following theorem.

THEOREM. The operator $A^*$, conjugate to the linear operator $A$ from $X$
into $X'$, is a linear operator from ${X'}^*$ into $X^*$ and $\|A^*\| = \|A\|$.
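In the finite-dimensional case the conjugate operator is given by the transposed matrix, and both the defining relation $l'(Ax) = (A^* l')(x)$ and the equality of norms can be checked directly. The sketch below is an illustration under stated assumptions: $X$ and $X'$ are taken to be coordinate spaces with the max-norm (so that the conjugate spaces carry the sum of absolute values as norm), and the matrix `A`, the element `x` and the functional `l_prime` are hypothetical.

```python
def apply(M, x):
    """Matrix-vector product M x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def pairing(l, x):
    """Value l(x) of the functional with coefficients l on the element x."""
    return sum(li * xi for li, xi in zip(l, x))

def norm_max_to_max(M):
    """Norm of M between max-norm spaces: largest absolute row sum."""
    return max(sum(abs(c) for c in row) for row in M)

def norm_one_to_one(M):
    """Norm of M between 1-norm spaces: largest absolute column sum."""
    return max(sum(abs(c) for c in col) for col in zip(*M))

A = [[1.0, -2.0, 0.5], [0.0, 3.0, 1.0]]   # hypothetical operator from R^3 into R^2
A_star = transpose(A)                      # acts from the conjugate of R^2 into that of R^3
x = [1.0, -1.0, 2.0]                       # hypothetical element of X
l_prime = [2.0, -1.0]                      # hypothetical functional on X'

print(pairing(l_prime, apply(A, x)), pairing(apply(A_star, l_prime), x))
print(norm_max_to_max(A), norm_one_to_one(A_star))
```

Each printed pair should consist of two equal numbers: the first pair expresses $l'(Ax) = (A^* l')(x)$, the second $\|A\| = \|A^*\|$.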

Note. Notice that, if $X'$ is the same as $X$, $A^*$ is a linear operator
in $X^*$ with a range of values also in $X^*$.

If $A$ and $B$ are linear operators from $X$ into $X'$, it follows from the
definition of conjugate operator that $(A + B)^* = A^* + B^*$. If $A$ and
$B$ are linear operators from $X$ into $X$, then $(BA)^* = A^* B^*$, as follows
from $l(BAx) = (B^* l)(Ax) = A^*(B^* l)(x) = (A^* B^* l)(x)$. In the case
of a real space $(cA)^* = cA^*$, whilst for a complex space $(cA)^* = \bar{c} A^*$.
106. Completely continuous operators. A linear operator $A$ from $X$
into $X'$ is said to be completely continuous if it transforms any set
bounded in $X$ to a set compact in $X'$. It is easily seen that a
distributive operator $A$, defined in the whole of $X$ and transforming every
bounded set into a compact set, is bounded, i.e. a linear operator.
For, by hypothesis, $A$ transforms the sphere $\|x\| \le 1$ into the
compact set $Ax$. But every compact set is bounded, i.e. there exists a
positive number $C$ such that $\|Ax\| \le C$ for $\|x\| \le 1$, whence it
follows that $A$ is a bounded operator. The definition of completely
continuous operator can therefore be stated as follows: a distributive
operator $A$, defined in the whole of $X$, is said to be completely
continuous if it transforms every bounded set into a compact set. It follows
from what has been said that the completely continuous operator
thus defined is in fact linear.

THEOREM 1. If $A$ is a completely continuous operator and $x_n \xrightarrow{w} x_0$,
then $Ax_n \Rightarrow Ax_0$.

We know that $Ax_n \xrightarrow{w} Ax_0$, if $A$ is a linear operator.
By hypothesis, $x_n \xrightarrow{w} x_0$, so that the number sequence $\|x_n\|$ is bounded.
In view of the complete continuity of $A$, we can extract from the
sequence $Ax_n$ a subsequence which is strongly convergent to some
element $y_0 \in X'$. On the other hand, it follows from what has been
said that this subsequence is weakly convergent to $Ax_0$, so that
$y_0 = Ax_0$. Thus every strongly convergent subsequence of the
sequence $Ax_n$ is strongly convergent to $Ax_0$. We have to show that the
whole of this sequence is strongly convergent to $Ax_0$.

We use reductio ad absurdum. Suppose that there exists a number
$\delta > 0$ and an infinite subsequence $Ax_{n_k}$ such that $\|Ax_{n_k} - Ax_0\| \ge \delta$.

We can extract from the sequence $Ax_{n_k}$ a strongly convergent
subsequence, and this subsequence must converge strongly to $Ax_0$ as
indicated above, which contradicts the inequality
$\|Ax_{n_k} - Ax_0\| \ge \delta > 0$. The theorem is proved.
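Theorem 1 can be illustrated in $l_2$ by the base vectors, which converge weakly to the zero element while keeping norm 1. The sketch below is illustrative only: it assumes the diagonal operator $(\xi_1, \xi_2, \ldots) \to (\xi_1/1, \xi_2/2, \ldots)$, which is a standard example of a completely continuous operator in $l_2$, and works in an assumed finite truncation `dim` of the space.

```python
import math

def unit_vector(n, dim):
    """Base vector with 1 in position n (1-based) and 0 elsewhere."""
    return [1.0 if k == n else 0.0 for k in range(1, dim + 1)]

def apply_diagonal(x):
    """Hypothetical operator A: (xi_1, xi_2, ...) -> (xi_1/1, xi_2/2, ...)."""
    return [c / k for k, c in enumerate(x, start=1)]

def l2_norm(x):
    return math.sqrt(sum(c * c for c in x))

dim = 500  # assumed truncation of l_2
for n in (1, 10, 100, 500):
    e_n = unit_vector(n, dim)
    # The norm of e_n stays 1, while the norm of A e_n equals 1/n and tends to zero.
    print(n, l2_norm(e_n), l2_norm(apply_diagonal(e_n)))
```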

THEOREM 2. Let $A_m$ $(m = 1, 2, \ldots)$ be a sequence of completely
continuous operators which converge in norm to the linear operator $A$
($\|A - A_m\| \to 0$). The operator $A$ must be completely continuous.

We have to show that $A$ transforms every bounded sequence of
elements $x_n \in X$ $(n = 1, 2, \ldots)$ into a compact sequence. Given any
fixed $m$, the sequence $A_m x_n$ is compact. On the other hand, the
convergence in norm implies the uniform convergence on every bounded
set. Thus, given any $\varepsilon > 0$, there exists an $m$ such that
$\|Ax_n - A_m x_n\| \le \varepsilon$ $(n = 1, 2, \ldots)$, i.e. $Ax_n$ has a compact
$\varepsilon$ net $A_m x_n$, whence it follows that $Ax_n$ is compact.

It can be shown that, if $A$ is a completely continuous operator, then
$A^*$, defined in ${X'}^*$ and having a range of values in $X^*$, is also
completely continuous.

We shall prove this later for operators in Hilbert space. The theory 
of completely continuous operators will be treated in more detail for 
this space. 


107. Operator equations. We shall now assume that a linear operator
$A$, defined in space $X$, has a range of values also belonging to $X$
($X'$ coincides with $X$). Now, $A^*$ defined in $X^*$ has a range of values
also in $X^*$. As above, we shall write $E$ for the operator of the identity
transformation in any B space, i.e. $Ex = x$ for any $x \in X$. We also
recall that the annihilation operator is a linear operator transforming
any element $x$ into the zero element. Its norm is equal to zero, whereas
that of any other linear operator is positive.

We take the equation

$$(A - E)x = y, \tag{97}$$

where $y$ is the given and $x$ the required element of $X$. We rewrite (97) as

$$x = Ax - y. \tag{98}$$

The right-hand side of this equation is an operator from $X$ into $X$.
We write $Bx = Ax - y$ ($B$ is not a linear operator). We have
$Bx_1 - Bx_2 = Ax_1 - Ax_2$, whence $\|Bx_1 - Bx_2\| \le \|A\| \cdot \|x_1 - x_2\|$.
If $\|A\| < 1$, the principle of compressed mappings is applicable to (98).
We get the following result.

THEOREM. If $\|A\| < 1$, given any $y \in X$, equation (97) has a unique
solution, and this solution can be obtained by the method of successive
approximations from equation (98) with any initial approximation.

We shall often be concerned below with an equation containing a
parameter:

$$(A - \lambda E)x = y. \tag{99}$$

If $X$ is a real space of type B, $\lambda$ is a real number. It may be complex
for complex spaces. Assuming $\lambda \ne 0$, (99) can be written as
$\left(\frac{1}{\lambda} A - E\right)x = \frac{1}{\lambda}\, y$. It follows from the last theorem that, if $|\lambda| > \|A\|$,
(99) has a unique solution for any $y$, and this solution can be obtained
by the method of successive approximations from the equation

$$x = \frac{1}{\lambda} Ax - \frac{1}{\lambda}\, y.$$
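The method of successive approximations for equation (98) can be sketched in the finite-dimensional case. The example below is an illustration only: the matrix `A` and the right-hand side `y` are hypothetical, the norm is taken to be the max-norm (for which the norm of `A` is its largest absolute row sum, here 0.5), and the number of iterations is an assumed fixed value rather than a stopping criterion.

```python
def apply(M, x):
    """Matrix-vector product M x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def solve_by_iteration(A, y, x0, steps=200):
    """Successive approximations x_{k+1} = A x_k - y for the equation x = Ax - y."""
    x = list(x0)
    for _ in range(steps):
        Ax = apply(A, x)
        x = [a - b for a, b in zip(Ax, y)]
    return x

A = [[0.2, 0.3], [-0.1, 0.4]]   # hypothetical operator, largest absolute row sum 0.5 < 1
y = [1.0, -2.0]                 # hypothetical right-hand side

x = solve_by_iteration(A, y, [0.0, 0.0])
# Residual of the original equation (A - E)x = y; it should be close to zero.
residual = [a - xi - yi for a, xi, yi in zip(apply(A, x), x, y)]
print(x, residual)

# A different initial approximation should lead to the same solution.
x_other = solve_by_iteration(A, y, [10.0, -7.0])
print(x_other)
```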
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Equation (97) is occasionally written as 
(AA —- E)x=y or (E—AA)x=y. 


The condition |4|> || A || is here replaced by | 4|< || A ||7}. 
Now let $A$ be a completely continuous operator, and let us write
two equations, one in space $X$ and the other in $X^*$:
$$(A - E)x = y \tag{100}$$
$$(A^* - E)x^* = y^*, \tag{101}$$
where $y$ and $y^*$ are the given, and $x$, $x^*$ the required elements.
We also write down the corresponding homogeneous equations
$$(A - E)x = \theta \tag{102}$$
$$(A^* - E)x^* = \theta^*, \tag{103}$$
where $\theta$ and $\theta^*$ are the zero elements in $X$ and $X^*$ respectively. The
sets of solutions of these equations are lineals. We shall next state
the results regarding the equations written. We shall prove them in
the next chapter for the case of Hilbert space.

If one of equations (100) or (101) has a solution, given any right-hand 
side, the other equation has the same property. In this case, given any 
right-hand side, each equation has a unique solution, i.e. (102) and 
(103) only have the trivial solutions $x = \theta$ and $x^* = \theta^*$.

If one of equations (102) or (103) has a non-zero solution, the other 
has this property, and the number of linearly independent solutions 
is finite and the same for (102) and (103). The lineals of the solutions 
are finite-dimensional subspaces. Here, the necessary and sufficient 
condition for (100) to be soluble is that y be orthogonal to all the 
solutions of (103), whilst the necessary and sufficient condition for 
(101) to be soluble is that y* be orthogonal to all the solutions of (102) 
[IV; 9]. 

These results are naturally preserved for equations with a parameter.
In the case of a complex B space, the equations are written as
$$(A - \lambda E)x = y \quad (104); \qquad (A^* - \bar\lambda E)x^* = y^*; \quad (105)$$
$$(A - \lambda E)x = \theta \quad (106); \qquad (A^* - \bar\lambda E)x^* = \theta^*. \quad (107)$$

A further result must be mentioned. There is only a finite number
of values of $\lambda$, satisfying the condition $|\lambda| > R$, where $R$ is any given
positive number, for which (106), and hence also (107), have non-trivial
solutions (not $x = \theta$ or $x^* = \theta^*$).
The above results are precisely analogous to the theorems that we
had in the theory of integral equations.

If $X$ is a real B space, $\lambda$ must be taken as real and $\bar\lambda = \lambda$.

A $\lambda$ for which (106) has non-trivial solutions is called an eigenvalue
of the operator $A$, and the number of linearly independent solutions
of the equation is called the rank of the eigenvalue. The solutions
$x_1, x_2, \ldots, x_m$ form a complete set of linearly independent solutions
of (106) if the general form of any solution of the equation is
$x = c_1 x_1 + c_2 x_2 + \ldots + c_m x_m$, where the $c_k$ are arbitrary numbers.
The representation of any solution $x$ in this form is unique, in view
of the linear independence of the $x_k$. The linearly independent solutions
can be chosen in different ways, but the number of solutions is always
the same in a complete set. If (106) has non-trivial solutions and the
element $y$ appearing in (104) satisfies the above-mentioned solubility
condition, all the solutions of (104) are expressible by
$$x = x_0 + c_1 x_1 + c_2 x_2 + \ldots + c_m x_m, \tag{108}$$
where $x_0$ is any given solution of (104), $x_1, x_2, \ldots, x_m$ is a complete
set of linearly independent solutions of (106) and $c_k$ are arbitrary
numbers. All this follows directly from the linearity of (104) and (106)
[IV; 9, 10].


108. Completely continuous operators in $C$, $L_p$ and $l_p$. 1. We consider
the integral operator
$$\varphi(x) = \int_a^b K(x,t)\,\psi(t)\,dt \tag{109}$$
in space $C$, where $[a, b]$ is a finite interval. If the kernel $K(x, t)$ is
continuous in the square $Q$ $[a \le x \le b;\ a \le t \le b]$, (109) obviously
yields a distributive operator from $C$ into $C$. The fact that it is bounded
follows at once from
$$\max_{a \le x \le b} |\varphi(x)| \le \max_{a \le t \le b} |\psi(t)| \cdot \max_{a \le x \le b} \int_a^b |K(x,t)|\,dt. \tag{110}$$
If $U$ is a bounded set of functions $\psi(t)$ in $C$, i.e. $\max_{a \le t \le b} |\psi(t)| \le A$,
it is readily seen that the set of corresponding $\varphi(x)$ is compact. Its
boundedness follows at once from (110), since $\max_{a \le t \le b} |\psi(t)| \le A$,
whilst the equicontinuity follows from
$$|\varphi(x_2) - \varphi(x_1)| \le A \int_a^b |K(x_2,t) - K(x_1,t)|\,dt. \tag{111}$$
Thus operator (109) is completely continuous in $C$ when the kernel
is continuous in $Q$.
It can be shown that the norm of operator (109) is strictly equal to
$$\max_{a \le x \le b} \int_a^b |K(x,t)|\,dt.$$
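Estimate (110) can be checked numerically on a discretized version of operator (109). The following is a rough sketch, assuming the interval $[0, 1]$, a hypothetical continuous kernel and a hypothetical input function, with all integrals replaced by midpoint sums; none of these choices comes from the text.

```python
import math


def kernel(x, t):
    # Hypothetical continuous kernel on the square [0, 1] x [0, 1].
    return math.cos(3.0 * x * t) - x


def psi(t):
    # Hypothetical continuous input function.
    return math.sin(5.0 * t)


N = 400
h = 1.0 / N
nodes = [(j + 0.5) * h for j in range(N)]  # midpoint-rule nodes


def apply_operator(x):
    """Midpoint approximation of the integral in (109)."""
    return h * sum(kernel(x, t) * psi(t) for t in nodes)


def abs_kernel_integral(x):
    """Midpoint approximation of the integral of |K(x, t)| over t."""
    return h * sum(abs(kernel(x, t)) for t in nodes)


max_phi = max(abs(apply_operator(x)) for x in nodes)
max_psi = max(abs(psi(t)) for t in nodes)
bound = max(abs_kernel_integral(x) for x in nodes)
print(max_phi, max_psi * bound)
```

For the discretized sums the inequality is just the triangle inequality, so the first printed number cannot exceed the second apart from rounding.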
Operator (109) is completely continuous in $C$ with fewer assumptions
regarding the kernel. Suppose e.g. that $K(x, t)$ is a bounded function
measurable in the square $Q$ and that
$$\lim_{x' \to x} K(x', t) = K(x, t) \tag{112}$$
for any $x$ of $[a, b]$ for almost all $t$. Now [54]:
$$\lim_{x' \to x} \int_a^b |K(x', t) - K(x, t)|\,dt = 0$$
and, given any $\varepsilon > 0$, there exists an $\eta > 0$ such that
$$\int_a^b |K(x_2, t) - K(x_1, t)|\,dt \le \varepsilon \quad\text{for}\quad |x_2 - x_1| \le \eta.$$
This last is proved in the same way as the uniform continuity of a
function continuous on a finite closed interval [I; 43].

The proof that the set of $\varphi(x)$ is bounded and equicontinuous is the
same as above. We shall later investigate in detail integral operators with
polar kernels.

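2. Let us now consider operator (109) in $L_p$ ($p > 1$) on the assumption
that the kernel $K(x, t) \in L_{p'}(Q)$ ($1/p + 1/p' = 1$), i.e.
$$\int_a^b\!\!\int_a^b |K(x,t)|^{p'}\,dx\,dt = A^{p'} < +\infty. \tag{113}$$
If $\psi(x)$ is any function of $L_p[a, b]$, integral (109) has a meaning.
It is easily shown that it defines a measurable function $\varphi(x)$ [cf. 68].
We have by Hölder's formula:
$$|\varphi(x)| \le \left[\int_a^b |K(x,t)|^{p'}\,dt\right]^{1/p'} \left[\int_a^b |\psi(t)|^{p}\,dt\right]^{1/p}.$$
On raising both sides to the power $p'$ and integrating with respect
to $x$, we get
$$\|\varphi\|_{L_{p'}}^{p'} \le A^{p'}\,\|\psi\|_{L_p}^{p'}, \quad\text{i.e.}\quad \|\varphi\|_{L_{p'}} \le A\,\|\psi\|_{L_p}, \tag{114}$$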
i.e. (109) is a linear operator from $L_p$ into $L_{p'}$. It can be shown that $A$
is the norm of this operator. Let us show that (109) is completely continuous.
Let $U$ be a set of functions $\psi(x)$ which is bounded in $L_p$, and
$V$ the set of corresponding $\varphi(x) \in L_{p'}$. We have to show that $V$ is
compact. By hypothesis, $\|\psi\|_{L_p} \le C$ if $\psi(x) \in U$, and it follows from
(114) that $\|\varphi\|_{L_{p'}} \le AC$. It remains to show that the $\varphi(x)$ are
equicontinuous in the mean. On extending $\varphi(x)$ by zero outside $[a, b]$, and
$K(x, t)$ by zero outside $Q$, we have
$$\varphi(x + h) - \varphi(x) = \int_a^b [K(x + h, t) - K(x, t)]\,\psi(t)\,dt,$$
whence, as above,
$$\|\varphi(x + h) - \varphi(x)\|_{L_{p'}} \le \left[\int_a^b\!\!\int_a^b |K(x + h, t) - K(x, t)|^{p'}\,dx\,dt\right]^{1/p'} \|\psi\|_{L_p},$$
i.e.
$$\|\varphi(x + h) - \varphi(x)\|_{L_{p'}} \le \left[\int_a^b\!\!\int_a^b |K(x + h, t) - K(x, t)|^{p'}\,dx\,dt\right]^{1/p'} C. \tag{115}$$
Since $K(x, t)$ is continuous in the mean in $L_{p'}$ on $Q$, given any
$\varepsilon > 0$, there exists an $\eta > 0$ such that
$$\int_a^b\!\!\int_a^b |K(x + h, t) - K(x, t)|^{p'}\,dx\,dt \le \frac{\varepsilon^{p'}}{C^{p'}} \quad\text{for}\quad |h| \le \eta,$$
and it follows from (115) that
$$\|\varphi(x + h) - \varphi(x)\|_{L_{p'}} \le \varepsilon \quad\text{for}\quad |h| \le \eta,$$
where $\eta$ is the same for all $\varphi(x) \in V$, which is what we set out to prove.
The proof is similar for the case of an infinite interval and several
independent variables.

3. We now consider the operator from $l_p$ into $l_{p'}$ ($p > 1$) given by
$$\eta_i = a_{i1}\xi_1 + a_{i2}\xi_2 + \ldots \tag{116}$$
on condition that
$$\sum_{i,k=1}^{\infty} |a_{ik}|^{p'} = A^{p'} < +\infty. \tag{117}$$
On using the notation $x(\xi_1, \xi_2, \ldots)$ and $y(\eta_1, \eta_2, \ldots)$ for the elements
and applying Hölder's inequality for sums, we obtain, precisely as
above,
$$\|y\|_{l_{p'}} \le A\,\|x\|_{l_p}, \tag{118}$$
so that operator (116) is a linear operator from $l_p$ into $l_{p'}$. Let us show
that it is completely continuous. Let $U$ be a bounded set of elements
$x \in l_p$ ($\|x\| \le C$) and $V$ the corresponding set of elements $y \in l_{p'}$.
We have to show that $V$ is compact. It is bounded by virtue of (118),
and it remains to show that, given any $\varepsilon > 0$, there exists a positive
integer $n_0$ such that
$$\sum_{i=n_0}^{\infty} |\eta_i|^{p'} \le \varepsilon^{p'}. \tag{119}$$
We have by (116) and Hölder's inequality:
$$\sum_{i=n_0}^{\infty} |\eta_i|^{p'} \le \sum_{i=n_0}^{\infty} \Big(\sum_{k=1}^{\infty} |a_{ik}\xi_k|\Big)^{p'} \le \sum_{i=n_0}^{\infty} \sum_{k=1}^{\infty} |a_{ik}|^{p'}\, C^{p'}. \tag{120}$$
By (117), the double series with general term $|a_{ik}|^{p'}$ is convergent,
so that there exists an $n_0$ such that
$$\sum_{i=n_0}^{\infty} \sum_{k=1}^{\infty} |a_{ik}|^{p'} \le \frac{\varepsilon^{p'}}{C^{p'}}, \tag{121}$$
whence (119) follows, in view of (120).
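Inequality (118) can be illustrated on a finite section of an infinite matrix. The following is a small sketch, assuming $p = 3$ (so that $p' = 3/2$) and hypothetical coefficients $a_{ik}$ and coordinates $\xi_k$ chosen only for the example.

```python
p = 3.0
p_conj = p / (p - 1.0)  # conjugate exponent p', so that 1/p + 1/p' = 1

n = 50
# Hypothetical coefficients a_ik: a finite section of an infinite matrix.
a = [[1.0 / ((i + 1) * (k + 1)) ** 2 for k in range(n)] for i in range(n)]
# Hypothetical element x = (xi_1, xi_2, ...), truncated to n coordinates.
xi = [(-1.0) ** k / (k + 1) for k in range(n)]

# Formula (116): eta_i = a_i1 xi_1 + a_i2 xi_2 + ...
eta = [sum(a[i][k] * xi[k] for k in range(n)) for i in range(n)]

# The constant A of (117) and the two norms entering (118).
A_const = sum(abs(a_ik) ** p_conj for row in a for a_ik in row) ** (1.0 / p_conj)
norm_x = sum(abs(v) ** p for v in xi) ** (1.0 / p)
norm_y = sum(abs(v) ** p_conj for v in eta) ** (1.0 / p_conj)
print(norm_y, A_const * norm_x)
```

The first printed number is the norm of $y$ in $l_{p'}$ and the second is the bound $A\,\|x\|_{l_p}$ for the truncated data.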


109. Generalized derivatives. We shall now introduce a new type
of derivative, which is often employed in modern mathematical
physics. Let $D$ be a bounded domain of $n$-dimensional Euclidean
space $R_n$, a point $x$ of which is defined by the Cartesian coordinates
$(x_1, x_2, \ldots, x_n)$. We shall always take a domain to mean an open
connected set, and shall assume that the boundaries of any domains
discussed have zero volume measure. As usual, we shall write $\bar D$ for
the domain $D$ along with its boundary (the closed domain). We shall
say that $D'$ lies strictly inside $D$ if $D' \subset D$ and the distance from $D'$
to the boundary of $D$ is positive. This is equivalent to the fact that
$\bar D' \subset D$. As earlier, we shall describe a function as finite in $D$ if it is
zero outside some domain $D'$ lying strictly inside $D$ ($D'$ may be
different for different functions). Let $\varphi(x)$ have continuous derivatives
up to order $l$ inside $D$ and let $\psi(x)$ be finite and $l$ times continuously
differentiable. Let us consider a derivative of order $l$:
$$D^l\varphi = \frac{\partial^l \varphi}{\partial x_1^{l_1}\,\partial x_2^{l_2} \cdots \partial x_n^{l_n}} \qquad (l_1 + l_2 + \ldots + l_n = l). \tag{122}$$
Using the formula for integration by parts, and the fact that $\psi(x)$
is finite, we obtain
$$\int_D D^l\varphi(x)\,\psi(x)\,dx = (-1)^l \int_D \varphi(x)\,D^l\psi(x)\,dx. \tag{123}$$
A more general concept of derivative can be based on (123).

DEFINITION 1. Let $\varphi(x)$ and $\chi(x)$ be summable over any subdomain $D'$
lying strictly inside $D$, and let
$$\int_D \chi(x)\,\psi(x)\,dx = (-1)^l \int_D \varphi(x)\,D^l\psi(x)\,dx \tag{124}$$
for any finite $l$ times continuously differentiable function $\psi(x)$.
In this case, $\chi(x)$ is called the generalized derivative of the form (122)
of $\varphi(x)$ in $D$.

Let us show that only one generalized derivative of a given form can
exist for any given $\varphi(x)$. Let $\chi(x)$ and $\chi_1(x)$ be two generalized
derivatives. Equation (124) holds for both $\chi(x)$ and $\chi_1(x)$. Term by term
subtraction gives
$$\int_D [\chi(x) - \chi_1(x)]\,\psi(x)\,dx = 0, \tag{125}$$
whence it follows, since the finite function $\psi(x)$ is arbitrary, that
$\chi(x)$ and $\chi_1(x)$ are equivalent functions in $D$ [71].
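Definition 1 can be tested numerically in the simplest case $n = 1$, $l = 1$. The following is a sketch under stated assumptions: $D$ is the interval $(-1, 1)$, $\varphi(x) = |x|$, the candidate derivative is $\chi(x) = \operatorname{sign} x$, and the finite test function $\psi$ is a hypothetical smooth bump vanishing for $|x| \ge 0.8$; integrals are replaced by midpoint sums.

```python
import math

R = 0.8  # psi vanishes for |x| >= R, i.e. outside a strictly interior interval


def bump(x):
    u = R * R - x * x
    return math.exp(-1.0 / u) if u > 0.0 else 0.0


def bump_prime(x):
    u = R * R - x * x
    return bump(x) * (-2.0 * x / (u * u)) if u > 0.0 else 0.0


def psi(x):
    # Hypothetical finite test function (deliberately not symmetric).
    return (1.0 + x) * bump(x)


def psi_prime(x):
    return bump(x) + (1.0 + x) * bump_prime(x)


def phi(x):
    return abs(x)


def chi(x):
    # Candidate generalized derivative of |x|.
    return 1.0 if x > 0.0 else -1.0


N = 20000
h = 2.0 / N
mid = [-1.0 + (j + 0.5) * h for j in range(N)]

# Both sides of (124) with l = 1: integral of chi*psi and minus integral of phi*psi'.
lhs = h * sum(chi(x) * psi(x) for x in mid)
rhs = -h * sum(phi(x) * psi_prime(x) for x in mid)
print(lhs, rhs)
```

The two sums agree to quadrature accuracy for this test function; a proof requires (124) for every admissible $\psi$, which a single numerical check does not provide.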

If $\varphi(x)$ has continuous derivatives up to order $l$ inside $D$, (123)
holds and $\chi(x) = D^l\varphi(x)$. We shall retain the notation (122) in future
for the generalized derivatives. Let us note some properties of the
generalized derivative that follow directly from the definition. The
generalized derivative $D^l\varphi(x)$ does not depend on the order in which
the differentiations are written, since the order of differentiating $\psi(x)$,
which has continuous derivatives, is arbitrary in (124). If $\varphi_1(x)$ and
$\varphi_2(x)$ have generalized derivatives $\chi_1(x)$ and $\chi_2(x)$ of type (122),
$c_1\varphi_1(x) + c_2\varphi_2(x)$ has the generalized derivative $c_1\chi_1(x) + c_2\chi_2(x)$ of
the same type ($c_1$ and $c_2$ are constants). If $\chi(x)$ is the generalized
derivative of $\varphi(x)$ in $D$, it will be the generalized derivative of the same
type in any domain $D'$ belonging to $D$.

If $\varphi(x)$ has a generalized derivative $\partial\varphi(x)/\partial x_1 = \chi(x)$ and $\chi(x)$ has
a generalized derivative $\partial\chi(x)/\partial x_2$, $\varphi(x)$ has a generalized derivative
$\partial^2\varphi(x)/\partial x_1\,\partial x_2 = \partial\chi(x)/\partial x_2$. Similarly for the other types of derivative.
Further, if $\varphi(x)$ has generalized derivatives $\partial\varphi(x)/\partial x_1$ and $\partial^2\varphi(x)/\partial x_1\,\partial x_2$,
then $\partial^2\varphi(x)/\partial x_1\,\partial x_2$ is the generalized derivative of $\partial\varphi(x)/\partial x_1$ with
respect to $x_2$. Below, we shall also prove that, given certain auxiliary
restrictions, the usual formula for differentiation of a product holds:
$$\frac{\partial(\varphi_1(x)\,\varphi_2(x))}{\partial x_1} = \frac{\partial\varphi_1(x)}{\partial x_1}\,\varphi_2(x) + \varphi_1(x)\,\frac{\partial\varphi_2(x)}{\partial x_1}. \tag{126}$$

We now establish the connection between generalized differentiation
and the averaging operation. Let $\omega_h(|x - y|)$ be any given averaging
kernel, depending on the distance between the points $x$ and $y$, and
$\varphi_h(x)$ the mean functions formed for $\varphi(x)$:
$$\varphi_h(x) = \frac{1}{\kappa h^n}\int_D \omega_h(|x - y|)\,\varphi(y)\,dy. \tag{127}$$
Assuming that $\varphi(x)$ has the generalized derivative $\chi(x) = D^l\varphi(x)$
of type (122) in $D$, let us work out the corresponding (obviously ordinary)
derivative of the mean functions [71]:
$$D_x^l\varphi_h(x) = \frac{1}{\kappa h^n}\int_D \varphi(y)\,D_x^l\omega_h(|x - y|)\,dy = \frac{(-1)^l}{\kappa h^n}\int_D \varphi(y)\,D_y^l\omega_h(|x - y|)\,dy. \tag{128}$$
We shall assume that the point $x \in D$ is at a distance greater than $h$
from the boundary of $D$. Since the function $\omega_h(|x - y|)$ vanishes
outside a sphere of radius $h$ with centre at the point $x$, it can be taken
as the finite function in (124). Together with (128), this leads to
$$D^l\varphi_h(x) = \frac{1}{\kappa h^n}\int_D \omega_h(|x - y|)\,D^l\varphi(y)\,dy, \tag{129}$$
which can be stated as: the mean functions of the generalized derivatives
coincide with the derivatives of the same type of the mean
functions at all points of the domain $D$ whose distance from the
boundary is greater than the averaging radius.
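The statement that averaging commutes with generalized differentiation can be observed numerically. The following is a sketch, assuming one variable, $\varphi(x) = |x|$ with generalized derivative $\operatorname{sign} x$, an averaging radius of 0.1 and a commonly used smooth kernel; the kernel, the normalization by the sum of the weights and the finite-difference step are assumptions of the example, not taken from [71].

```python
import math

H = 0.1  # averaging radius h (illustrative value)


def omega(r):
    """Smooth kernel vanishing for r >= H (an assumed choice of kernel)."""
    u = H * H - r * r
    return math.exp(-H * H / u) if u > 0.0 else 0.0


M = 2000
STEP = 2.0 * H / M
OFFSETS = [-H + (j + 0.5) * STEP for j in range(M)]
RAW = [omega(abs(s)) for s in OFFSETS]
TOTAL = sum(RAW)
WEIGHTS = [w / TOTAL for w in RAW]  # normalized so that the weights sum to 1


def mean_function(f, x):
    """Discretized mean function: weighted average of f over |x - y| < H."""
    return sum(w * f(x - s) for w, s in zip(WEIGHTS, OFFSETS))


def phi(x):
    return abs(x)


def chi(x):
    # Generalized derivative of |x|.
    return 1.0 if x > 0.0 else -1.0


def derivative_of_mean(x, d=1e-5):
    """Central difference approximating the ordinary derivative of phi_h."""
    return (mean_function(phi, x + d) - mean_function(phi, x - d)) / (2.0 * d)


x0 = 0.03  # a point whose distance from the ends of (-1, 1) exceeds H
left = derivative_of_mean(x0)    # derivative of the mean function
right = mean_function(chi, x0)   # mean function of the generalized derivative
print(left, right)
```

The two printed values are the discrete counterparts of the two sides of (129) at the chosen point.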

We can now say, on the basis of the properties of the mean functions
[71] that, as $h \to 0$, $\varphi_h(x) \to \varphi(x)$ and $D^l\varphi_h(x) \to D^l\varphi(x)$ in $L(D')$,
where $D'$ is any strictly interior subdomain of $D$. Furthermore, if we
make the supplementary assumption that $\varphi(x)$ is summable over any
strictly interior subdomain $D'$ to any given degree $p > 1$, and the
generalized derivative $D^l\varphi(x)$ to any given degree $q > 1$, we have
convergence of $\varphi_h(x)$ and $D^l\varphi_h(x)$ in $L_p(D')$ and $L_q(D')$ respectively.
A word of warning. Suppose the definition of $\varphi(x)$ is somehow extended
to the whole of $R_n$, e.g. it is put equal to zero outside $D$. The $\varphi_h(x)$ are
now also defined in the whole of space and converge to $\varphi(x)$ in $L_p(D)$
as $h \to 0$. But the functions $D^l\varphi_h(x)$ will not in general be convergent
to $D^l\varphi(x)$ in space $L_q(D)$. This is bound up with the fact that the
extended function $\varphi(x)$ may not have the corresponding generalized
derivative throughout $R_n$.

Let us now turn to the proof of (126) for the differentiation of a
product. We must first prove a simple proposition. Let $\varphi_1(x) \in L_p(D')$
($p > 1$) and $\varphi_2(x) \in L_{p'}(D')$ ($1/p + 1/p' = 1$) in any strictly interior
domain $D'$ of $D$ and let $\psi(x)$ be a bounded function finite in $D$. Then
$$\int_D \varphi_{1h}(x)\,\varphi_{2h}(x)\,\psi(x)\,dx \to \int_D \varphi_1(x)\,\varphi_2(x)\,\psi(x)\,dx \quad\text{as}\quad h \to 0.$$
In fact, we find by using Hölder's inequality:
$$\left|\int_D [\varphi_{1h}(x)\varphi_{2h}(x) - \varphi_1(x)\varphi_2(x)]\,\psi(x)\,dx\right| \le$$
$$\le \int_D |\varphi_{1h}(x)| \cdot |\varphi_{2h}(x) - \varphi_2(x)| \cdot |\psi(x)|\,dx + \int_D |\varphi_2(x)| \cdot |\varphi_{1h}(x) - \varphi_1(x)| \cdot |\psi(x)|\,dx \le$$
$$\le C\left[\|\varphi_{1h}\|_{L_p(D')}\,\|\varphi_{2h} - \varphi_2\|_{L_{p'}(D')} + \|\varphi_2\|_{L_{p'}(D')}\,\|\varphi_{1h} - \varphi_1\|_{L_p(D')}\right].$$
Here, $C = \sup |\psi(x)|$ and $D'$ is the subdomain of $D$ outside which
$\psi(x)$ vanishes. The right-hand side of the last inequality tends to zero
as $h \to 0$, since $\varphi_{1h}(x) \to \varphi_1(x)$ in $L_p(D')$, $\varphi_{2h}(x) \to \varphi_2(x)$ in $L_{p'}(D')$ and
the sequence $\varphi_{1h}(x)$, convergent in $L_p(D')$, is bounded in norm in $L_p(D')$.

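We shall now establish (126) on the assumption that $\varphi_1(x)$ and
$\partial\varphi_1(x)/\partial x_1 \in L_p(D')$, whilst $\varphi_2(x)$ and $\partial\varphi_2(x)/\partial x_1 \in L_{p'}(D')$ for any
strictly interior subdomain $D'$. Making use of the previous proposition,
we have for any continuously differentiable finite function $\psi(x)$,
vanishing outside $D'$ [62]:
$$\int_D \varphi_1(x)\,\varphi_2(x)\,\frac{\partial\psi(x)}{\partial x_1}\,dx = \lim_{h \to 0} \int_D \varphi_{1h}(x)\,\varphi_{2h}(x)\,\frac{\partial\psi(x)}{\partial x_1}\,dx,$$
so that
$$\int_D \varphi_1(x)\,\varphi_2(x)\,\frac{\partial\psi(x)}{\partial x_1}\,dx = -\lim_{h \to 0} \int_D \left[\frac{\partial\varphi_{1h}(x)}{\partial x_1}\,\varphi_{2h}(x) + \varphi_{1h}(x)\,\frac{\partial\varphi_{2h}(x)}{\partial x_1}\right]\psi(x)\,dx. \tag{130}$$
Given sufficiently small $h$, we can apply (129) in $D'$ and replace
$\partial\varphi_{1h}(x)/\partial x_1$ in (130) by $(\partial\varphi_1(x)/\partial x_1)_h$ and $\partial\varphi_{2h}(x)/\partial x_1$ by $(\partial\varphi_2(x)/\partial x_1)_h$.
On again applying our auxiliary proposition to the right-hand side of
(130), we get
$$\int_D \varphi_1(x)\,\varphi_2(x)\,\frac{\partial\psi(x)}{\partial x_1}\,dx = -\int_D \left[\frac{\partial\varphi_1(x)}{\partial x_1}\,\varphi_2(x) + \varphi_1(x)\,\frac{\partial\varphi_2(x)}{\partial x_1}\right]\psi(x)\,dx.$$
This last equation implies that the product $\varphi_1(x)\varphi_2(x)$ has a generalized
derivative with respect to $x_1$ in $D$, which can be evaluated in
accordance with (126).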
Notice that (126) also holds for $p = 1$. In this case we have to take
$p' = \infty$, i.e. assume that $\varphi_2(x)$ and $\partial\varphi_2(x)/\partial x_1$ are bounded in any
subdomain $D'$.

We now show that a second definition of generalized derivative
can be given, and its equivalence to the original definition established
on the basis of (129).

DEFINITION 2. The function $\chi(x)$ is called the generalized derivative
of type (122) of a function $\varphi(x)$ in $D$ if there exists a sequence of functions
$\varphi_m(x)$, $l$ times continuously differentiable inside $D$, such that $\varphi_m(x)$ and
$D^l\varphi_m(x)$ are convergent to $\varphi(x)$ and $\chi(x)$ respectively in $L(D')$, where $D'$
is any strictly interior subdomain of $D$.

THEOREM 1. Definitions 1 and 2 are equivalent.

Let $\chi(x)$ be the generalized derivative of $\varphi(x)$ in accordance with the
second definition. Equation (123) holds, when $\varphi(x)$ is replaced by $\varphi_m(x)$,
and, since $\varphi_m(x) \to \varphi(x)$ and $D^l\varphi_m(x) \to \chi(x)$ in $L(D')$, given any choice of
finite function $\psi(x)$ with the above-mentioned properties, we can pass to the
limit under the integral sign [cf. 62], whence (124) follows.

Now let $\chi(x)$ be the generalized derivative of $\varphi(x)$ in the sense of
the first definition. By (129) and Theorem 4 of [71], the sequence of
$\varphi_m(x)$ required by the second definition is given by the mean functions
$\varphi_{h_m}(x)$, given any sequence $h_m$ tending to zero (we are assuming that
$\varphi(x)$ is continued by zero outside $D$). Theorem 1 is proved. It follows
from this theorem that the generalized derivative is unique if it exists,
in the sense of the second definition.

We now prove a theorem to the effect that generalized derivatives
are capable of weak convergence in $L_p(D')$.

THEOREM 2. Let $\varphi_k(x)$ ($k = 1, 2, \ldots$), defined inside $D$, be weakly
convergent to a function $\varphi(x)$ in $L_p(D')$ ($p > 1$), where $D'$ is any domain
lying strictly inside $D$, have generalized derivatives $D^l\varphi_k(x)$ of form (122)
in $D$ and norms of $D^l\varphi_k(x)$ in $L_p(D')$ bounded by some number $M(D')$,
which depends on the choice of $D'$. Then $\varphi(x)$ has a generalized derivative
$D^l\varphi(x)$ of form (122) in $D$, equal to the weak limit of $D^l\varphi_k(x)$ in $L_p(D')$.

PROOF. In view of the weak compactness of bounded sets in $L_p$ for
$p > 1$, the inequality
$$\|D^l\varphi_k\|_{L_p(D')} \le M(D') \tag{131}$$
implies the existence of a subsequence $\varphi_{m_k}(x)$ such that $D^l\varphi_{m_k}(x)$ are
weakly convergent in $L_p(D')$. By taking a sequence of strictly interior
expanding domains $D'_i$, convergent to $D$, we can form with the aid of
a diagonal process a subsequence $\varphi_{m_k}(x)$ for which the derivatives
$D^l\varphi_{m_k}(x)$ are weakly convergent in $L_p(D')$ to some function $\chi(x)$ in any
strictly interior subdomain $D'$. It is clear that $\chi(x)$ is defined everywhere
in $D$ and belongs to $L_p(D')$ for a strictly interior domain $D'$.

Equation (123) holds when $\varphi(x)$ is replaced by $\varphi_{m_k}(x)$. On passing to
the limit in it with $\psi(x)$ fixed, and observing that $\psi(x)$ is finite, we
arrive at (124) (weak convergence), whence it follows that $\chi(x)$ is the
generalized derivative of $\varphi(x)$ in $D$. It follows from what has been said
that any weakly convergent subsequence $D^l\varphi_{m_k}(x)$ has the same limit
$\chi(x)$ (the generalized derivative is unique), and we can easily conclude
from this that the entire sequence $D^l\varphi_k(x)$ is weakly convergent to $\chi(x)$.

Notes 1. This last theorem shows that, if $\varphi(x) \in L_p(D')$ and (131)
holds for the derivatives of the mean functions $\varphi_h(x)$, there exists in $D$
the generalized derivative $D^l\varphi(x) \in L_p(D')$. We have already seen
that in this case $D^l\varphi_h(x) \to D^l\varphi(x)$ in $L_p(D')$, so that the norm of
$D^l\varphi(x)$ satisfies (131).

2. In the conditions of the theorem, functions $\varphi(x)$ and $D^l\varphi(x)$ may
belong to $L_p(D')$ and $L_q(D')$ respectively, with $p \ne q$.

3. Theorem 2 remains in force with $p = 1$, if, instead of (131), we
assume the weak compactness of functions $D^l\varphi_k(x)$ in $L(D')$ for any
strictly interior subdomain $D'$ of $D$.


110. Generalized derivatives (continued). Let us now establish the
connection between the existence of the generalized derivatives
and the absolute continuity of functions. We take the case of one
independent variable and $0 < x < 1$ as the fundamental domain $D$.
Let $\varphi(x)$ be absolutely continuous in $[0, 1]$. As we know from [74],
$\varphi(x)$ has a derivative $\varphi'(x)$ almost everywhere in $[0, 1]$, which is
summable in $[0, 1]$. The formula for integration by parts [74] gives, for
any continuously differentiable finite function $\psi(x)$,
$$\int_0^1 \varphi(x)\,\psi'(x)\,dx = -\int_0^1 \varphi'(x)\,\psi(x)\,dx, \tag{132}$$
which shows that $\varphi'(x)$ is the generalized derivative of $\varphi(x)$.

Now let $\varphi(x) \in L([0, 1])$ and have a generalized derivative $d\varphi(x)/dx$
in $D$, belonging to $L([0, 1])$. Let us show that $\varphi(x)$ is now equivalent
to some function absolutely continuous in $[0, 1]$.

We write
$$\varphi_1(x) = \int_0^x \frac{d\varphi(t)}{dt}\,dt$$
and observe that $\varphi_1(x)$ is absolutely continuous and that its derivative
$\varphi_1'(x)$ is equivalent to $d\varphi(x)/dx$ [74]. The difference
$\varphi^*(x) = \varphi(x) - \varphi_1(x)$ obviously has a generalized derivative equivalent
to zero. We fix $\varepsilon > 0$ and consider the interval $[\varepsilon, 1 - \varepsilon]$. Given
sufficiently small $h$, the derivative of the mean function $\varphi_h^*(x)$ is equal
to zero in $[\varepsilon, 1 - \varepsilon]$, so that $\varphi_h^*(x)$ is constant in $[\varepsilon, 1 - \varepsilon]$. Since a
limit of constants must be constant, and $\varphi_h^*(x) \to \varphi^*(x)$ in
$L([\varepsilon, 1 - \varepsilon])$, $\varphi^*(x)$ is equivalent to a constant in $[\varepsilon, 1 - \varepsilon]$.
Hence it follows that
$$\varphi(x) = \varphi(0) + \int_0^x \frac{d\varphi(t)}{dt}\,dt \tag{133}$$
everywhere in $D$, discounting equivalence. We have thus established
that the existence of a generalized derivative is equivalent to the absolute
continuity of $\varphi(x)$. It may be shown similarly, for the case
of several independent variables, that, if $\varphi(x_1, x_2, \ldots, x_n)$ has a
generalized derivative $\partial\varphi(x)/\partial x_1$ say in the cube
$[0 \le x_k \le 1;\ k = 1, 2, \ldots, n]$ and $\varphi(x)$ and $\partial\varphi(x)/\partial x_1 \in L_p$ ($p > 1$)
in this cube, then $\varphi(x)$ is absolutely continuous for $0 \le x_1 \le 1$ for almost
all values of $(x_2, x_3, \ldots, x_n)$ in the cube $[0 \le x_k \le 1;\ k = 2, 3, \ldots, n]$
and we have
$$\varphi(x_1, x_2, \ldots, x_n) = \varphi(0, x_2, \ldots, x_n) + \int_0^{x_1} \frac{\partial\varphi(t, x_2, \ldots, x_n)}{\partial t}\,dt \tag{134}$$
for all $x_1$ of $[0, 1]$.

This equation, like (133), needs some explanation. The function $\varphi(x)$
and its generalized derivative $D\varphi(x)$ are defined up to a set
of measure zero; thus (133) and (134) have to be understood in the
sense that there are functions of the class of functions equivalent to
$\varphi(x)$ for which these equations hold.

We shall now give an example of a function $\varphi(x_1, x_2)$, having a
generalized mixed derivative $\partial^2\varphi(x_1, x_2)/\partial x_1\,\partial x_2$, but not having
generalized first derivatives. The function $\varphi(x_1, x_2) = f(x_1) + f(x_2)$
($0 \le x_k \le 1$; $k = 1, 2$) has this property, where $f(x)$ is the continuous
function of [76]. The function $\varphi(x_1, x_2)$ has no generalized first
derivatives, since $f(x)$ is not absolutely continuous. The generalized
derivative $\partial^2\varphi(x_1, x_2)/\partial x_1\,\partial x_2$, on the other hand, exists and is exactly
equal to zero. For, given any smooth finite function $\psi(x_1, x_2)$, we have
$$\int_0^1\!\!\int_0^1 f(x_1)\,\frac{\partial^2\psi}{\partial x_1\,\partial x_2}\,dx_1\,dx_2 = \int_0^1 f(x_1)\left[\frac{\partial\psi}{\partial x_1}\right]_{x_2=0}^{x_2=1} dx_1 = 0$$
and similarly for $f(x_2)$, i.e.
$$\int_0^1\!\!\int_0^1 [f(x_1) + f(x_2)]\,\frac{\partial^2\psi}{\partial x_1\,\partial x_2}\,dx_1\,dx_2 = 0,$$
whence it follows (definition 1) that the generalized derivative
$$\frac{\partial^2\varphi(x_1, x_2)}{\partial x_1\,\partial x_2} = 0.$$
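The vanishing of the double integral can be observed numerically. The following is a sketch under stated assumptions: the Cantor staircase is used as a stand-in for the continuous, not absolutely continuous function $f$ (the function of [76] is not reproduced here), the test function is the hypothetical product $\psi(x_1, x_2) = g(x_1)\,g(x_2)$ of smooth bumps vanishing near the edges of the square, and the integral is replaced by a midpoint sum.

```python
import math


def cantor(x):
    """Cantor staircase on [0, 1]: continuous but not absolutely continuous."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    value, scale = 0.0, 0.5
    for _ in range(40):
        x *= 3.0
        if x >= 2.0:      # ternary digit 2 contributes a binary digit 1
            value += scale
            x -= 2.0
        elif x >= 1.0:    # ternary digit 1: the staircase is flat from here on
            return value + scale
        scale *= 0.5
    return value


C, R = 0.5, 0.4  # g vanishes for |x - C| >= R, strictly inside (0, 1)


def bump(x):
    u = R * R - (x - C) ** 2
    return math.exp(-R * R / u) if u > 0.0 else 0.0


def bump_prime(x):
    u = R * R - (x - C) ** 2
    return bump(x) * (-2.0 * R * R * (x - C) / (u * u)) if u > 0.0 else 0.0


def g_prime(x):
    # Derivative of the one-dimensional factor g(x) = (1 + x) * bump(x).
    return bump(x) + (1.0 + x) * bump_prime(x)


N = 400
h = 1.0 / N
nodes = [(j + 0.5) * h for j in range(N)]
f_vals = [cantor(x) for x in nodes]
gp_vals = [g_prime(x) for x in nodes]

# Midpoint sum for the double integral of [f(x1) + f(x2)] * d^2 psi / dx1 dx2,
# where d^2 psi / dx1 dx2 = g'(x1) * g'(x2).
integral = h * h * sum(
    (f_vals[i] + f_vals[j]) * gp_vals[i] * gp_vals[j]
    for i in range(N)
    for j in range(N)
)
print(integral)
```

The printed value is zero to quadrature accuracy, in agreement with the argument above, although the one-dimensional factor $f$ has no summable generalized derivative.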

It is worth noticing that, if $\varphi(x_1, x_2, \ldots, x_n)$ is continuous in $D$
and if $D$ can be divided with the aid of a finite number of smooth
surfaces into a finite number of domains $D_i$ ($i = 1, 2, \ldots, m$), in each
of which $\varphi(x)$ is continuously differentiable with respect to an $x_k$ as
far as the boundary, then $\varphi(x)$ has a generalized derivative in $D$
equal to $\partial\varphi(x)/\partial x_k$ in each of the $D_i$. This derivative can have
discontinuities of the first kind on the above-mentioned surfaces. Our
assertion follows at once from the formula for integration by parts:
$$\int_{D_i} \varphi(x)\,\frac{\partial\psi(x)}{\partial x_k}\,dx = -\int_{D_i} \frac{\partial\varphi(x)}{\partial x_k}\,\psi(x)\,dx + \int_{S_i} \varphi(x)\,\psi(x)\cos(n, x_k)\,dS, \tag{135}$$
where $S_i$ is the boundary of $D_i$ and $n$ the direction of the normal to
$S_i$, outward with respect to $D_i$. We only need to observe that the
integrals over the surfaces $S_i$ cancel on summation over $i$.

If $\varphi(x)$ has different limiting values on an $(n - 1)$-dimensional piece
of surface, lying in $D$, and the direction $x_k$ does not lie in the tangent
plane to this surface, no generalized derivative $\partial\varphi(x)/\partial x_k$ exists in $D$.
This follows from the connection established above between the
absolute continuity of $\varphi(x)$ and the existence of the generalized
derivative.

Note. Instead of the concept of a separate generalized derivative
of a summable function $\varphi(x)$, we can introduce the concept of a
generalized linear differential operator of any order, say
$$L(\varphi) = \sum_{i,k=1}^{n} a_{ik}\,\frac{\partial^2\varphi}{\partial x_i\,\partial x_k} + \sum_{i=1}^{n} b_i\,\frac{\partial\varphi}{\partial x_i} + c\,\varphi(x), \tag{136}$$
where the coefficients are sufficiently smooth functions of $(x_1, x_2, \ldots, x_n)$.
Such a generalized operator is defined by an equation analogous
to (124):
$$\int_D \varphi(x)\,M(\psi)\,dx = \int_D L(\varphi)\,\psi(x)\,dx, \tag{137}$$
where $M(\psi)$ is the conjugate differential operator and $\psi(x)$ is any
smooth function finite in $D$ [IV; 158]. The existence of the individual
derivatives appearing in the operator $L(\varphi)$ is not assumed here.


111. The case of a star-shaped domain. We have shown [109]
with the aid of mean functions that, given any $\varphi(x)$ of $L_p(D')$ ($p > 1$)
having a generalized derivative $\chi(x) = D^l\varphi(x)$ also of $L_p(D')$, there
exists a sequence of $\varphi_k(x)$, $l$ times continuously differentiable in $D$,
such that $\varphi_k(x) \to \varphi(x)$ and $D^l\varphi_k(x) \to \chi(x)$ in $L_p(D')$. (Here, as above,
$D'$ is any strictly interior subdomain of $D$.) We now show that an
analogous approximation of functions $\varphi(x)$ and $D^l\varphi(x)$ is also possible
in space $L_p(D)$ for an important class of domains.

We shall describe $D$ as a star-shaped domain if there exists an interior
point $x_0$ such that every radius vector from $x_0$ cuts the boundary in
only one point. It may also be said that the domain is star-shaped
with respect to the point $x_0$.

THEOREM. Let $D$ be a star-shaped domain and $\varphi(x)$ have a generalized
derivative $D^l\varphi(x)$ in $D$, where $\varphi(x)$ and $D^l\varphi(x)$ belong to $L_p(D)$ ($p > 1$).
Then there exists a sequence $\varphi_k(x)$ of functions $l$ times continuously
differentiable in $\bar D$ such that $\varphi_k(x)$ and $D^l\varphi_k(x)$ are convergent to $\varphi(x)$ and
$D^l\varphi(x)$ in $L_p(D)$.

We speak of functions $\varphi_k(x)$, $l$ times continuously differentiable in $\bar D$,
if $\varphi_k(x)$ are continuous in $\bar D$, are continuously differentiable inside $D$
up to order $l$, and their derivatives can be given a supplementary
definition on the boundary of $D$ in such a way as to obtain functions
continuous in $\bar D$.

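We take the above $x_0$ as origin and form the sequence of functions
$\varphi\left(\frac{k-1}{k}x\right)$ ($k = 2, 3, 4, \ldots$), defined inside the domains $D_k$,
containing $\bar D$ strictly inside themselves, $D_k$ being got from $D$ by a similitude
transformation with similitude coefficient $k/(k - 1)$.

We write $\varphi\left(\frac{k-1}{k}x\right) = \varphi^{(k)}(x)$ and show that the $\varphi^{(k)}(x)$ are
convergent in $L_p(D)$ to $\varphi(x)$, whilst the generalized derivatives $D^l\varphi^{(k)}(x)$ are
convergent in $L_p(D)$ to $\chi(x) = D^l\varphi(x)$. Let us prove say the second
assertion. We have
$$\|D^l\varphi(x) - D^l\varphi^{(k)}(x)\| = \left[\int_D \left|\chi(x) - \left(\frac{k-1}{k}\right)^{l} \chi\!\left(\frac{k-1}{k}x\right)\right|^p dx\right]^{1/p} \le$$
$$\le \left[1 - \left(\frac{k-1}{k}\right)^{l}\right] \left[\int_D \left|\chi\!\left(\frac{k-1}{k}x\right)\right|^p dx\right]^{1/p} + \left[\int_D \left|\chi(x) - \chi\!\left(\frac{k-1}{k}x\right)\right|^p dx\right]^{1/p}.$$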
The distance between the points $\frac{k-1}{k}x$ and $x$ does not exceed
$d/k$, where $d$ is the diameter of $D$, and consequently tends to zero
uniformly in $D$. On repeating the argument of the theorem on continuity
in the mean [70], the second term on the right-hand side of
the last inequality will be seen to tend to zero as $k \to \infty$.

The factor $1 - \left(\frac{k-1}{k}\right)^{l}$ in the first term tends to zero, whilst
the second factor is convergent by what has been said to $\|\chi(x)\|_{L_p(D)}$
(continuity of the norm). The proof of the convergence of $\varphi^{(k)}(x)$ to
$\varphi(x)$ in $L_p(D)$ is even simpler. We observe that, with $k$ fixed, $D$ is
strictly interior with respect to $D_k$, so that the mean functions $\varphi_h^{(k)}(x)$
are convergent to $\varphi^{(k)}(x)$, whilst their derivatives $D^l\varphi_h^{(k)}(x)$ are
convergent to $D^l\varphi^{(k)}(x)$ in $L_p(D)$ as $h \to 0$. Hence it follows that $\varphi(x)$ and
$\chi(x)$ can be approximated in the metric of $L_p(D)$ by functions $\varphi_{h_k}^{(k)}(x)$
and $D^l\varphi_{h_k}^{(k)}(x)$ infinitely differentiable in $\bar D$ with a suitable choice of
the sequence $h_k \to 0$. The functions $\varphi_{h_k}^{(k)}(x)$ can be taken as the $\varphi_k(x)$
of the statement of the theorem. The theorem is proved.


112. Spaces $W_p^{(l)}$ and $\widetilde{W}_p^{(l)}$. As above, let $D$ be a bounded domain
of $n$-dimensional space. We consider the set of all functions $\varphi(x)$
having all generalized derivatives of order $l$, where $\varphi(x)$ and all the
$D^l\varphi(x)$ belong to $L_p(D)$ ($p > 1$). This class of functions will be denoted
by $W_p^{(l)}(D)$; it can be converted into a normed space by introducing
the norm in accordance with the formula
$$\|\varphi\|_{W_p^{(l)}(D)} = \left[\int_D |\varphi(x)|^p\,dx + \int_D \sum_{l_1+\ldots+l_n=l} \left|\frac{\partial^l\varphi}{\partial x_1^{l_1} \cdots \partial x_n^{l_n}}\right|^p dx\right]^{1/p}. \tag{138}$$
Here and below, $\sum_{l_1+\ldots+l_n=l}$ denotes summation over all possible
sets of non-negative integers $(l_1, l_2, \ldots, l_n)$, the sums of which yield $l$. The
fundamental properties of the norm, as indicated in [95], may easily
be verified. Let us show that space $W_p^{(l)}$ is complete. Let $\varphi_k(x)$ be a
mutually convergent sequence in $W_p^{(l)}(D)$, i.e.
$$\int_D \left[|\varphi_k(x) - \varphi_m(x)|^p + \sum_{l_1+\ldots+l_n=l} \left|\frac{\partial^l\varphi_k(x)}{\partial x_1^{l_1} \cdots \partial x_n^{l_n}} - \frac{\partial^l\varphi_m(x)}{\partial x_1^{l_1} \cdots \partial x_n^{l_n}}\right|^p\right] dx \to 0$$
as $k$ and $m \to \infty$. Hence it follows that the sequences $\varphi_k(x)$ and
$D^l\varphi_k(x)$ are mutually convergent in $L_p(D)$. In view of the completeness
of $L_p(D)$ and Theorem 2 of [109], we find that $\varphi_k(x)$ are convergent in
$L_p(D)$ to some function $\varphi(x)$, this latter having all possible generalized
derivatives of order $l$ from $L_p(D)$ and $D^l\varphi_k(x) \to D^l\varphi(x)$ in $L_p(D)$.
What has been said is equivalent to the convergence of $\varphi_k(x)$ to $\varphi(x)$
in $W_p^{(l)}(D)$. Thus $W_p^{(l)}(D)$ is a B space (a complete linear normed space).
Let us dwell on the proof that $W_p^{(l)}(D)$ is separable. To this end, we
represent $D$ as the union of a denumerable set of non-overlapping semi-open
intervals [32]. We enumerate the corresponding open intervals, written
as $D_k$ ($k = 1, 2, \ldots$) and introduce the set $V_p^{(l)}(D)$ of functions $\varphi(x)$,
belonging to $W_p^{(l)}(D_k)$ in each interval $D_k$, and such that the series is
convergent:
$$\|\varphi\|^p_{V_p^{(l)}(D)} = \sum_{k=1}^{\infty} \|\varphi\|^p_{W_p^{(l)}(D_k)}. \tag{139}$$
Specification of the norm in accordance with (139) converts $V_p^{(l)}(D)$
into a linear normed space. It is easily seen that functions of $W_p^{(l)}(D)$
belong to $V_p^{(l)}(D)$ and $\|\varphi\|_{V_p^{(l)}(D)} = \|\varphi\|_{W_p^{(l)}(D)}$. Thus $W_p^{(l)}(D)$ is a
subspace of $V_p^{(l)}(D)$, and it is sufficient for us to show that the latter
is separable [94].

The set of functions of $V_p^{(l)}(D)$, differing from zero only in a finite
number of intervals, is dense in $V_p^{(l)}(D)$. For, let $\varphi(x) \in V_p^{(l)}(D)$ and
$\varepsilon > 0$ be an arbitrary number. We put $\varphi_m(x) = \varphi(x)$ for $x \in \bigcup_{k=1}^{m} D_k$
and $\varphi_m(x) = 0$ in the remaining part of $D$. Obviously, $\varphi_m(x) \in V_p^{(l)}(D)$,
and for sufficiently large $m$:
$$\|\varphi - \varphi_m\|^p_{V_p^{(l)}(D)} = \sum_{k=m+1}^{\infty} \|\varphi\|^p_{W_p^{(l)}(D_k)} \le \varepsilon,$$
in view of the convergence of series (139). The functions $\varphi_m(x)$ can be
approximated in the metric of $W_p^{(l)}(D_k)$ in each of the intervals $D_k$
($k \le m$) by $l$ times continuously differentiable functions [111], and
in turn the latter can be uniformly approximated together with their
derivatives in $\bar D_k$ by polynomials with rational coefficients. Hence it
follows that the set of functions, each of which differs from zero only
in a finite number of intervals $D_k$ and coincides with a polynomial
with rational coefficients in each of these, is dense in $V_p^{(l)}(D)$. It is
easily seen that the set of such functions is denumerable, i.e. $V_p^{(l)}(D)$
is separable. This proves that space $W_p^{(l)}(D)$ is separable.
We must now dwell on a special problem. Suppose that, in addition
to the basic norm $\|x\|$, another norm $\|x\|_1$ is introduced into a
linear normed space $X$, where for all $x \in X$:
$$c_2\,\|x\| \le \|x\|_1 \le c_1\,\|x\| \tag{140}$$
and $c_1 > 0$, $c_2 > 0$ are constants. Norms satisfying condition (140)
are said to be equivalent. Obviously, a sequence $x_n$, convergent in one
norm, is convergent in the other. It is unimportant which equivalent
norm is considered when deciding about denseness, separability,
compactness etc. Similarly, a distributive operator, bounded in one
norm, is also bounded with respect to another (equivalent) norm.
Generally speaking, the norm of a bounded operator changes when
passing to an equivalent norm in the space, but remains finite.

We must now consider in more detail the question of equivalent
norms in n-dimensional real Euclidean space $R_n$. We have introduced
a norm into $R_n$ in accordance with

$$\|x\| = \sqrt{x_1^2 + x_2^2 + \dots + x_n^2}. \qquad (141)$$

Now let the function $f(x) = f(x_1, x_2, \dots, x_n)$ have the properties
of the norm [95] and, in addition, be continuous on the surface of the
unit sphere $x_1^2 + x_2^2 + \dots + x_n^2 = 1$. We show that $\|x\|_1 = f(x)$
represents a norm equivalent to (141). The continuous $f(x)$ attains its
maximum and minimum on the surface of the unit sphere. Let us
write $c_2 = \sup f(x)$ and $c_1 = \inf f(x)$ for $\|x\| = 1$. Since $f(x)$ is positive
and continuous, we have $0 < c_1 \le c_2 < +\infty$. In view of the homogeneity
of $f(x)$, i.e. $f(x) = \|x\|\, f(x/\|x\|)$ for $x \ne 0$, we have everywhere in $R_n$:

$$c_1\|x\| \le f(x) \le c_2\|x\|,$$

i.e. norms (141) and $f(x)$ are equivalent.
We can take as $f(x)$, say,

$$f(x) = \Big(\sum_{k=1}^{n} |x_k|^p\Big)^{1/p} \ (p \ge 1) \quad \text{or} \quad f(x) = \max_k |x_k|.$$
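The equivalence constants for these two examples can be checked numerically. The following is a minimal sketch, not part of the original text: the function names are hypothetical, and the closed-form extreme values of the p-norm on the Euclidean unit sphere (1 and $n^{1/p - 1/2}$) are a standard fact assumed here rather than taken from the book.

```python
import math
import random

def p_norm(x, p):
    """f(x) = (sum |x_k|^p)^(1/p) for p >= 1; p = math.inf gives max |x_k|."""
    if math.isinf(p):
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def random_unit_vector(n, rng):
    """Random point on the Euclidean unit sphere x_1^2 + ... + x_n^2 = 1."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        r = math.sqrt(sum(t * t for t in v))
        if r > 1e-12:
            return [t / r for t in v]

def sampled_constants(n, p, samples=2000, seed=0):
    """Sampled min and max of f on the unit sphere (estimates of c_1, c_2)."""
    rng = random.Random(seed)
    values = [p_norm(random_unit_vector(n, rng), p) for _ in range(samples)]
    return min(values), max(values)

def exact_constants(n, p):
    """Extreme values of the p-norm on the Euclidean unit sphere.

    They are attained at a coordinate vector (value 1) and at the
    vector (1, ..., 1)/sqrt(n) (value n^(1/p - 1/2)).
    """
    corner = n ** (-0.5) if math.isinf(p) else n ** (1.0 / p - 0.5)
    return min(1.0, corner), max(1.0, corner)
```

For instance, `exact_constants(4, 1)` gives the pair of constants $c_1 = 1$, $c_2 = 2$ for the norm $\sum |x_k|$ in $R_4$, and the sampled values returned by `sampled_constants(4, 1)` always lie between them.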


It can easily be shown, on the basis of these remarks, that the norms
given in $W_p^{(l)}(D)$ by

$$\|\varphi\| = \sum_{k_1+\dots+k_n=l} \|D^l\varphi\|_{L_p(D)} + \|\varphi\|_{L_p(D)}, \qquad (142)$$

$$\|\varphi\| = \max_{k_1+\dots+k_n=l} \|D^l\varphi\|_{L_p(D)} + \|\varphi\|_{L_p(D)}, \qquad (143)$$

$$\|\varphi\| = \Big\{\int_D \sum_{k_1+\dots+k_n=l} |D^l\varphi|^p\,dx\Big\}^{1/p} + \|\varphi\|_{L_p(D)} \qquad (144)$$

are equivalent to the basic norm (138). We shall make use of all this
later.

We now introduce a further function space. Given the set of functions
having all possible generalized derivatives in D up to and including
order l, and belonging to $L_p(D)$ together with these derivatives, we
define the norm by

$$\|\varphi\|_{\widetilde W_p^{(l)}(D)} = \sum_{k=0}^{l}\ \sum_{k_1+\dots+k_n=k} \|D^k\varphi\|_{L_p(D)}. \qquad (145)$$

Let $\widetilde W_p^{(l)}(D)$ denote the linear normed space obtained. It can be
shown, in the same way as for $W_p^{(l)}(D)$, that our new space is complete
and separable. The above remark about equivalent norms in $W_p^{(l)}(D)$
applies in equal measure to $\widetilde W_p^{(l)}(D)$. We shall show below [116] that,
for a fairly wide class of domains, spaces $W_p^{(l)}(D)$ and $\widetilde W_p^{(l)}(D)$ consist
of the same set of functions, and norms (138) and (145) are equivalent.
When l = 0 and l = 1, the two spaces coincide by definition, $W_p^{(0)}(D)$
being obviously $L_p(D)$.


113. Properties of functions of space $W_p^{(l)}(D)$. A systematic study
of the properties of functions of spaces $W_p^{(l)}(D)$ and $\widetilde W_p^{(l)}(D)$ will be
undertaken in the next few sections, devoted to so-called embedding
theorems. The results of the present section are particular cases of
these general theorems, but will be proved independently (by simpler
means) in view of their importance.

We must first note a property of $W_p^{(l)}(D)$ that follows directly from
the definition. Suppose we have a change of variables from $x\,(x_1, x_2, \dots, x_n)$
to $y\,(y_1, y_2, \dots, y_n)$ such that D is mapped one-to-one on to a domain
$D_1$, the mapping being expressed in both directions by functions
having continuous derivatives up to order l in the corresponding closed
domains. If our change of independent variables is carried out for the
functions, space $W_p^{(l)}(D)$ becomes $W_p^{(l)}(D_1)$.

It follows at once from the definition of $\widetilde W_p^{(l)}(D)$ that, if
$\varphi(x) \in \widetilde W_p^{(l)}(D)$, then $\varphi(x) \in \widetilde W_q^{(m)}(D)$ for $q \le p$ and $m \le l$. We shall prove
the following theorem in connection with this.

THEOREM 1. If U is a set of elements $\varphi(x) \in W_p^{(l)}(D)$, bounded in
$W_p^{(l)}(D)$ $(l \ge 1)$, it is compact in $W_p^{(l-1)}(D')$, where D' is any domain
lying strictly inside D.

We shall first prove the theorem for l = 1.
There exists by hypothesis a constant C such that, if $\varphi(x) \in U$, then

$$\|\varphi\|^p_{W_p^{(1)}(D)} = \int_D \Big[\,|\varphi(x)|^p + \sum_{k=1}^{n} \Big|\frac{\partial\varphi}{\partial x_k}\Big|^p\Big]\,dx \le C^p. \qquad (146)$$

We have to show that the set U is compact in $L_p(D')$, where D' is
a fixed domain lying strictly inside D. The fact that U is bounded in
$L_p(D')$ is an immediate consequence of (146). It remains to show the
equicontinuity in the mean of $\varphi(x) \in U$ in $L_p(D')$ [70]. Let us show
that, for sufficiently small $|\Delta x| = \sqrt{(\Delta x_1)^2 + (\Delta x_2)^2 + \dots + (\Delta x_n)^2}$,
we have

$$\int_{D'} |\varphi(x + \Delta x) - \varphi(x)|^p\,dx \le C_1 |\Delta x|^p, \qquad (147)$$

where $C_1$ is a constant, the same for all $\varphi(x) \in U$, whence the
equicontinuity follows.

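We can assume, by rotating the coordinate axes if necessary, that
$\Delta x$ is $(\Delta x_1, 0, 0, \dots, 0)$ and $\Delta x_1 > 0$. Let D'' be a domain of the same
type as D', D' being strictly inside D''. We shall assume $\Delta x_1$ so small
that $x + \Delta x$ does not go outside D'' when $x \in D'$. We suppose first
that $\varphi(x)$ is continuously differentiable in D. Obviously,

$$R = \int_{D'} |\varphi(x + \Delta x) - \varphi(x)|^p\,dx = \int_{D'} \Big|\int_0^{\Delta x_1} \frac{\partial\varphi(x_1 + \tau, x_2, \dots, x_n)}{\partial x_1}\,d\tau\Big|^p dx.$$

When p > 1, we can apply Hölder's inequality to the inner integral:

$$R \le \int_{D'} (\Delta x_1)^{p/p'} \Big[\int_0^{\Delta x_1} \Big|\frac{\partial\varphi(x_1 + \tau, x_2, \dots, x_n)}{\partial x_1}\Big|^p d\tau\Big] dx = (\Delta x_1)^{p/p'} \int_0^{\Delta x_1} \Big[\int_{D'} \Big|\frac{\partial\varphi(x_1 + \tau, x_2, \dots, x_n)}{\partial x_1}\Big|^p dx\Big] d\tau \le$$

$$\le (\Delta x_1)^{p/p'} \int_0^{\Delta x_1} \|\varphi\|^p_{W_p^{(1)}(D'')}\,d\tau = (\Delta x_1)^p\,\|\varphi\|^p_{W_p^{(1)}(D'')},$$

where we have used $p/p' + 1 = p$. Hence

$$\int_{D'} |\varphi(x + \Delta x) - \varphi(x)|^p\,dx \le (\Delta x_1)^p\,\|\varphi\|^p_{W_p^{(1)}(D'')}. \qquad (148)$$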


When p = 1, this inequality is obtained directly by changing the
order of integration. Inequality (148) holds for any function $\varphi(x)$ of
$W_p^{(1)}(D)$, as well as for continuously differentiable functions. This is
easily seen by choosing a sequence of continuously differentiable
functions convergent to $\varphi(x)$ in $W_p^{(1)}(D)$, and passing to the limit in
(148). In conjunction with (146), (148) leads to (147). The theorem is
thus proved for l = 1. Now let l = 2. On observing that, say, $\partial^2\varphi(x)/\partial x_1^2$
is the generalized derivative of $\partial\varphi(x)/\partial x_1$ with respect to $x_1$, we can
prove the theorem for l = 2 by applying it with l = 1. The case of
any l is similarly considered.
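Inequality (148) can be illustrated numerically in one dimension. The sketch below is an assumption-laden illustration, not the book's method: it takes D' = (a, b) and D'' = (a2, b2) as intervals, uses a hypothetical midpoint-rule helper `integrate`, and keeps only the derivative part of the norm on the right-hand side, which is all that the derivation above actually uses.

```python
import math

def integrate(g, a, b, steps=20000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

def shift_defect(phi, a, b, dx, p):
    """Left-hand side of (148) in one dimension, with D' = (a, b)."""
    return integrate(lambda x: abs(phi(x + dx) - phi(x)) ** p, a, b)

def derivative_bound(dphi, a2, b2, dx, p):
    """(dx)^p times the integral of |phi'|^p over D'' = (a2, b2).

    D'' must contain (a, b + dx) for the comparison with shift_defect
    to be the one-dimensional case of (148).
    """
    return dx ** p * integrate(lambda x: abs(dphi(x)) ** p, a2, b2)
```

With $\varphi(x) = \sin 5x$, D' = (0.2, 0.8), D'' = (0.1, 0.9) and $\Delta x_1 = 0.05$, the value of `shift_defect` stays below that of `derivative_bound` for p = 1, 2, 3, as the inequality requires.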

A further theorem must be mentioned, which is a direct consequence
of the theorem proved in [IV; 156].

THEOREM 2. If a sequence of functions $\varphi_k(x)$, continuous and having
continuous derivatives up to order $l = [n/2] + 1$, is convergent in
$W_2^{(l)}(D')$, where D' is any domain lying strictly inside D, the $\varphi_k(x)$ are
uniformly convergent in any domain D'.

It follows at once from what has been said that the limit function
$\varphi(x)$ is continuous inside D. Now suppose that we have a function
$\varphi(x) \in W_2^{(l)}(D)$, where $l \ge [n/2] + 1$, and $\varphi_k(x)$ are mean functions
for $\varphi(x)$, the averaging radius tending to zero as $k \to \infty$. On taking
into account the property of mean functions of the generalized
derivatives [109] and Theorem 2, we arrive at the proposition: if
$\varphi(x) \in W_2^{(l)}(D)$ and $l \ge [n/2] + 1$, $\varphi(x)$ is equivalent to a function
continuous in D.

Now let the sequence of $\varphi_k(x)$ be convergent in $W_p^{(1)}(D)$. Let us investigate
these functions on any section of the domain D. Let D be the cylinder
defined by $0 \le x_n \le a$ (a is a finite number) and $(x_1, x_2, \dots, x_{n-1})$
belonging to the closure $\bar{\mathcal E}$ of some finite domain $\mathcal E$ of the
$(x_1, x_2, \dots, x_{n-1})$ plane. We shall write $\mathcal E_{x_n}$ for the section of D
by the plane $x_n = \text{const}$.

THEOREM 3. Let $\varphi_k(x)$ (k = 1, 2, ...) be continuous and have continuous
derivatives $\partial\varphi_k(x)/\partial x_n$ in D, and let both $\varphi_k(x)$ and $\partial\varphi_k(x)/\partial x_n$ be
convergent in $L_p(D)$ (p > 1). Then the $\varphi_k(x)$ are uniformly convergent in
$L_p(\mathcal E_{x_n})$ with respect to $x_n$ of [0, a], the limit function $\varphi(x)$ of $L_p(D)$ is
defined on all sections $\mathcal E_{x_n}$ and, as an element of $L_p(\mathcal E_{x_n})$, depends
continuously on $x_n$.

We shall first prove the theorem for $x_n \in [a/2, a]$.
We take a function $\zeta(x_n)$, continuously differentiable in [0, a], equal
to zero for $x_n = 0$ and unity for $x_n \in [a/2, a]$. It may be verified
directly that the functions $\psi_k(x) = \zeta(x_n)\varphi_k(x)$ and $\partial\psi_k(x)/\partial x_n$ are
convergent in $L_p(D)$.


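On using the formula

$$\psi_k(x_1, x_2, \dots, x_n) = \int_0^{x_n} \frac{\partial\psi_k(x_1, \dots, x_{n-1}, \tau)}{\partial\tau}\,d\tau \qquad (149)$$

and Hölder's inequality, we obtain

$$\int_{\mathcal E} |\psi_k(x) - \psi_m(x)|^p\,dx_1 \dots dx_{n-1} = \int_{\mathcal E} \Big|\int_0^{x_n} \Big(\frac{\partial\psi_k(x_1, \dots, x_{n-1}, \tau)}{\partial\tau} - \frac{\partial\psi_m(x_1, \dots, x_{n-1}, \tau)}{\partial\tau}\Big) d\tau\Big|^p dx_1 \dots dx_{n-1} \le$$

$$\le x_n^{p/p'} \int_{\mathcal E} \int_0^{x_n} \Big|\frac{\partial\psi_k(x_1, \dots, x_{n-1}, \tau)}{\partial\tau} - \frac{\partial\psi_m(x_1, \dots, x_{n-1}, \tau)}{\partial\tau}\Big|^p dx_1 \dots dx_{n-1}\,d\tau \le a^{p/p'}\,\Big\|\frac{\partial\psi_k}{\partial x_n} - \frac{\partial\psi_m}{\partial x_n}\Big\|^p_{L_p(D)}.$$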

Hence it follows that the $\psi_k(x)$ are convergent uniformly with respect
to $x_n$, in the norm of $L_p(\mathcal E_{x_n})$, to some function $\psi(x)$. Since $\zeta(x_n) = 1$
for $x_n \in [a/2, a]$, we thus obtain for the $\varphi_k(x)$ a limit function $\varphi(x)$
in $L_p(\mathcal E_{x_n})$, defined on each section $\mathcal E_{x_n}$, if $x_n \in [a/2, a]$. This assertion
can be proved in precisely the same way for $x_n \in [0, a/2]$, and it is
obvious that the limit function $\varphi(x)$, defined on all the sections $\mathcal E_{x_n}$,
will be the limit of the $\varphi_k(x)$ in $L_p(D)$ also. It remains to prove that $\varphi(x)$,
as an element of $L_p(\mathcal E_{x_n})$, is continuously dependent on $x_n$, i.e. to
prove that

$$\lim_{\delta \to 0} \int_{\mathcal E} |\varphi(x_1, \dots, x_{n-1}, x_n + \delta) - \varphi(x_1, \dots, x_{n-1}, x_n)|^p\,dx_1 \dots dx_{n-1} = 0.$$

We have, in view of Hölder's inequality:

$$\int_{\mathcal E} |\psi_k(x_1, \dots, x_{n-1}, x_n + \delta) - \psi_k(x_1, \dots, x_{n-1}, x_n)|^p\,dx_1 \dots dx_{n-1} \le \delta^{p/p'} \int_{\mathcal E} \int_{x_n}^{x_n + \delta} \Big|\frac{\partial\psi_k}{\partial x_n}\Big|^p dx_1 \dots dx_{n-1}\,d\tau \le \delta^{p/p'}\,\Big\|\frac{\partial\psi_k}{\partial x_n}\Big\|^p_{L_p(D)},$$

and in the limit:

$$\int_{\mathcal E} |\psi(x_1, \dots, x_{n-1}, x_n + \delta) - \psi(x_1, \dots, x_{n-1}, x_n)|^p\,dx_1 \dots dx_{n-1} \le \delta^{p/p'}\,\Big\|\frac{\partial\psi}{\partial x_n}\Big\|^p_{L_p(D)},$$

whence follows the required relationship with $x_n \in [a/2, a]$. The
same holds for $x_n \in [0, a/2]$.

Note. The condition for continuity of the derivatives $\partial\varphi_k(x)/\partial x_n$
can be weakened in the theorem by requiring only the existence of the
generalized derivatives from $L_p(D)$. In this case (149) will hold [110]
for all $x_n \in [0, a]$ and almost all $(x_1, x_2, \dots, x_{n-1})$ of $\mathcal E$, and all the
subsequent arguments retain their force. The limit of $\partial\varphi_k(x)/\partial x_n$ in
$L_p(D)$ will obviously be the generalized derivative $\partial\varphi(x)/\partial x_n$ in D. It
follows from the last theorem and the properties of the generalized
derivatives that, if $\varphi(x)$ is given, say, in the cube Q $[0 \le x_i \le 1$;
i = 1, 2, ..., n] and belongs to $L_p(Q)$ (p > 1) together with the
generalized derivative $\partial\varphi(x)/\partial x_n$, it is equivalent to a function defined
on every section of the cube Q by the planes $x_n = \delta$ $(0 \le \delta \le 1)$,
belonging to $L_p(\mathcal E_{x_n})$ on these sections, and continuously dependent on $x_n$
in the norm of $L_p(\mathcal E_{x_n})$. This follows from the fact that such a function
$\varphi(x)$ can be approximated [111] by functions $\varphi_k(x)$ continuously
differentiable in Q, as indicated in the conditions of the theorem. In
particular, boundary values of $\varphi(x)$ will exist in the sense indicated on
the boundaries $x_n = 0$ and $x_n = 1$ of the cube.

Now let $\varphi(x)$ be given in some bounded domain D and belong to
$W_p^{(1)}(D)$. Further, let the boundary of D contain a smooth (n - 1)-dimensional
piece S. We map the part of D adjacent to S into a parallelepiped
with the aid of a transformation y = y(x), continuously
differentiable as far as the boundary (on the assumption that S is
such that this transformation is possible). If

$$x_n = F(x_1, x_2, \dots, x_{n-1})$$

is the equation of S, and the points $(x_1, x_2, \dots, x_n)$ satisfying
$\alpha \le x_k \le \beta$ (k = 1, 2, ..., n - 1); $0 \le x_n - F(x_1, \dots, x_{n-1}) \le \gamma$
belong to D, we can take as the new variables $y_i$:

$$y_1 = x_1;\quad y_2 = x_2;\quad \dots;\quad y_{n-1} = x_{n-1};\quad y_n = x_n - F(x_1, \dots, x_{n-1}).$$

The parallelepiped is defined by $\alpha \le y_k \le \beta$ (k = 1, 2, ..., n - 1);
$0 \le y_n \le \gamma$. In the new variables $\varphi(y)$ will belong to $W_p^{(1)}(Q)$, where Q
is the parallelepiped, and what has been said above will hold for it.

In particular, on approaching the boundary $\Gamma$ of the parallelepiped Q which
is the image of the piece S, the values of $\varphi(y)$ will approach in the norm
of $L_p(\Gamma)$ the values of $\varphi(y)$ on the boundary itself. This means, in the
old coordinates, that the values of $\varphi(x)$ on S and the values of $\varphi(x)$ on
"the corresponding images of the displaced surfaces" will be close
to each other in the sense of the norm of $L_p(S)$. We can speak in this
sense of the values on smooth (n - 1)-dimensional surfaces of the functions
$\varphi(x)$ of $W_p^{(1)}(D)$ (and in particular, of their values on smooth
pieces of (n - 1)-dimensional boundaries), and of the fact of these
values being taken continuously.

We now show that, given the conditions indicated below, the usual
formula for integration by parts holds for functions of space $W_p^{(1)}(D)$:

$$\int_D \frac{\partial\varphi(x)}{\partial x_k}\,\psi(x)\,dx = -\int_D \varphi(x)\,\frac{\partial\psi(x)}{\partial x_k}\,dx + \int_S \varphi(x)\,\psi(x)\cos(n, x_k)\,dS, \qquad (150)$$

where n is the outward normal to the boundary S of the domain D. We
assume that D can be divided into a finite number of domains $D_i$,
each of which is star-shaped with respect to one of its points and has
a piecewise smooth boundary. If we can prove that (150) holds for
each of the $D_i$, it can be shown to hold for the whole of D by summation
of the equation over all the $D_i$. So let D be star-shaped and
$\varphi(x) \in W_p^{(1)}(D)$, $\psi(x) \in W_{p'}^{(1)}(D)$ $(1/p + 1/p' = 1)$. By the theorem of [111],
there exist sequences of functions $\varphi_k(x)$ and $\psi_k(x)$, continuously
differentiable in D, convergent to $\varphi(x)$ in $W_p^{(1)}(D)$ and to $\psi(x)$ in $W_{p'}^{(1)}(D)$
respectively. By Theorem 3, $\varphi_k(x) \to \varphi(x)$ in $L_p(S)$, and $\psi_k(x) \to \psi(x)$
in $L_{p'}(S)$. Formula (150) holds for $\varphi_k(x)$ and $\psi_k(x)$. On passing to the
limit in it with respect to k, we find that (150) holds for $\varphi(x)$ and $\psi(x)$.

The solution of boundary value problems of mathematical physics
involves a discussion of certain subspaces of $W_p^{(l)}(D)$, consisting of elements
satisfying given homogeneous boundary conditions. They were first
introduced by K. Friedrichs (see Hilbert-Courant, Methoden der
mathematischen Physik, vol. II, ch. VII). As above, let D be a bounded domain
of space $R_n$ $(x_1, x_2, \dots, x_n)$. We shall write $\mathring C^{(l)}(D)$ for the set of all finite
functions, continuous and continuously differentiable up to order l in
D. We introduce the norm of $W_p^{(l)}(D)$ into this linear set, and we write
$\mathring W_p^{(l)}(D)$ for the result of closure of $\mathring C^{(l)}(D)$ with respect to this norm.
If $\varphi(x) \in \mathring C^{(l)}(D)$ and $\psi(x) \in W_{p'}^{(l)}(D)$ $(1/p + 1/p' = 1)$, we have by
definition of the generalized derivative,

$$\int_D D^k\varphi(x)\,\psi(x)\,dx = (-1)^k \int_D \varphi(x)\,D^k\psi(x)\,dx. \qquad (151)$$

On noticing that the elements of $\mathring W_p^{(l)}(D)$ are the limits of elements of
$\mathring C^{(l)}(D)$ in the norm of $W_p^{(l)}(D)$, we can say that (151) holds for any
$\varphi(x) \in \mathring W_p^{(l)}(D)$ and $\psi(x) \in W_{p'}^{(l)}(D)$. Obviously, $\mathring W_p^{(l)}(D)$ belongs to
$W_p^{(l)}(D)$. It may easily be seen that $\mathring W_p^{(l)}(D)$ is a proper part of $W_p^{(l)}(D)$
$(l \ge 1)$. For suppose we take the case l = 1. We write down the
formula for integration by parts:

$$\int_D \frac{\partial\varphi(x)}{\partial x_k}\,\psi(x)\,dx = -\int_D \varphi(x)\,\frac{\partial\psi(x)}{\partial x_k}\,dx + \int_S \varphi(x)\,\psi(x)\cos(n, x_k)\,dS,$$

where $\varphi(x)$ and $\psi(x)$ are continuously differentiable in D. There exist
in $W_p^{(1)}(D)$ functions $\varphi(x)$ of the type indicated, non-zero on S, and the
integral over S will be non-zero for these, given a suitable choice of
$\psi(x)$. For any $\varphi(x)$ of $\mathring W_p^{(1)}(D)$, on the other hand, (151) holds with k = 1, the integral
over the surface being absent.

Since $\psi(x) \in W_{p'}^{(l)}(D)$ is arbitrary, (151) really tells us that functions
$\varphi(x)$ of $\mathring W_p^{(l)}(D)$ "vanish on the boundary together with their derivatives
up to order l - 1".

If the boundary S is sufficiently smooth, $\varphi(x)$ and its derivatives up
to order (l - 1) tend to zero in the norm of $L_p(S)$ on approaching S,
as mentioned above.

In the general case our convergence condition only holds in the
sense that (151) holds for any $\psi(x) \in W_{p'}^{(l)}(D)$ and $\varphi(x) \in \mathring W_p^{(l)}(D)$.
When considering space $\mathring W_p^{(l)}(D)$, the smoothness of the boundary
does not play an essential part, since, on locating D inside some sphere
$D_0$ and extending the definition of $\varphi(x)$ in $D_0$ by zero (outside D),
we get $\varphi(x) \in \mathring W_p^{(l)}(D_0)$. Such an extension of functions of $\mathring W_p^{(l)}(D)$ gives
us the possibility of drawing more complete conclusions without
introducing domains D' lying strictly inside D. In particular: (1)
functions $\varphi(x)$ of $\mathring W_p^{(l)}(D)$ can be approximated in the norm of $W_p^{(l)}(D)$
by finite functions infinitely differentiable in D; (2) if
$\varphi(x) \in \mathring W_2^{(l)}(D)$ with $l \ge [n/2] + 1$, $\varphi(x)$ is equivalent to a function
continuous in D and vanishing on the boundary of D.

We conclude by showing that the closure of functions of $\mathring C^{(l)}(D)$
in $W_p^{(0)}(D)$ (p > 1), i.e. in $L_p(D)$, gives the whole of space $L_p(D)$.
In other words, smooth finite functions form a set dense in $L_p(D)$.

For, let $D_\delta$ be the set of points not less than a distance $\delta$ from the
boundary of D, and suppose that, for any $\varphi(x) \in L_p(D)$:

$$\varphi^{(\delta)}(x) = \begin{cases} \varphi(x), & \text{if } x \in D_\delta, \\ 0, & \text{if } x \notin D_\delta. \end{cases}$$

The functions $\varphi^{(\delta)}(x)$ are obviously dense in $L_p(D)$, since, as $\delta \to 0$,

$$\|\varphi - \varphi^{(\delta)}\|^p_{L_p(D)} = \int_{D - D_\delta} |\varphi(x)|^p\,dx \to 0.$$

We form the mean functions $\varphi_h^{(\delta)}(x)$ for $\varphi^{(\delta)}(x)$ with $h < \delta/2$. These
functions are smooth and finite in D, and converge to $\varphi^{(\delta)}(x)$ in $L_p(D)$
as $h \to 0$. Also, they form a set dense in $L_p(D)$. We have therefore
shown that $W_p^{(l)}(D)$ and $\mathring W_p^{(l)}(D)$ coincide when l = 0.


114. Embedding theorems. We now turn to a detailed discussion
of the properties of functions of space $W_p^{(l)}(D)$ and establish the
connection between the behaviour of the functions and that of their
derivatives in the domain itself and in sections of it of different
dimensions. A number of important inequalities will be obtained in
this connection, and we shall discuss on the basis of these the question
of equivalent norms in space $W_p^{(l)}(D)$. The aggregate of these results
is generally known as Sobolev's "embedding theorems". We shall state
these theorems in the present section.

We shall assume that D is star-shaped with respect to every point
of some sphere K lying inside D, or that D can be divided by smooth
surfaces into a finite number of domains of this type.

It may be shown for domains of this type that, if $\varphi(x)$ has all the
generalized derivatives of order $l \ge 1$ in D, where $\varphi(x)$ and these derivatives
belong to $L_p(D)$ (p > 1), then $\varphi(x)$ has every generalized derivative up
to order l, and these also belong to $L_p(D)$. In other words, the class of
functions $W_p^{(l)}(D)$ is embedded in the class $W_p^{(m)}(D)$ when m < l, as
also in class $\widetilde W_p^{(l)}(D)$. It follows from this, by the way, that classes
$W_p^{(l)}(D)$ and $\widetilde W_p^{(l)}(D)$ consist of the same functions for domains of the
type indicated. We agree to use for brevity the notation


$$\|\varphi\|_{p,l} = \|\varphi\|_{W_p^{(l)}(D)}.$$

It may be shown that

$$\sum_{k=1}^{l-1} \|\varphi\|_{p,k} \le A\,\|\varphi\|_{p,l}, \qquad (152)$$

where A is a positive constant independent of the choice of
$\varphi(x) \in W_p^{(l)}(D)$. Inequality (152) shows that the norms of spaces $W_p^{(l)}(D)$
and $\widetilde W_p^{(l)}(D)$ are equivalent, and we need not distinguish between
these spaces in future. These assertions are contained as particular
cases in more general theorems, to the statement of which we now
proceed.

The above assertions, and the theorems stated below, will be proved
in [115-118].

We introduce as a preliminary the space $C^{(l)}(D)$ of functions
continuous in D and having all derivatives up to and including order l,
these being also continuous in D, the norm in this space being defined
as follows:

$$\|\varphi\|_{C^{(l)}(D)} = \max_{0 \le k \le l} |D^k\varphi(x)|,$$

where $D^k\varphi(x)$ is any derivative of order k and the maximum is taken
over $x \in D$ and over all possible derivatives up to order l. We recall
that the continuity of any given derivative $D^k\varphi(x)$ in D is understood
as follows: $\varphi(x)$ has a continuous derivative $D^k\varphi(x)$ inside D, and this
latter can be given a supplementary definition on the boundary of D
so as to obtain a function continuous in D. Instead of the above norm
in $C^{(l)}(D)$, we can introduce the equivalent norm in accordance with

$$\|\varphi\| = \sum_{k=0}^{l}\ \sum_{k_1+\dots+k_n=k}\ \max_{x \in D} \Big|\frac{\partial^k\varphi(x)}{\partial x_1^{k_1} \dots \partial x_n^{k_n}}\Big|.$$

Obviously, $C^{(l)}(D)$ is a complete B space.
We shall write C(D), as earlier, for space $C^{(0)}(D)$.

THEOREM 1. If p > 1 and pl > n, every function $\varphi(x) \in W_p^{(l)}(D)$ is
equivalent to a function of C(D) and

$$\|\varphi\|_{C(D)} \le M\,\|\varphi\|_{W_p^{(l)}(D)}, \qquad (153)$$

where M is a constant depending only on the domain D. Every set U
bounded in $W_p^{(l)}(D)$ is compact in C(D).

THEOREM 2. If p > 1 and $pl \le n$, every function $\varphi(x) \in W_p^{(l)}(D)$ is
equivalent to a function $\varphi(x)$ which is defined almost everywhere on the
section $D_s$ of domain D by any plane of s > n - pl dimensions, $\varphi(x)$
being summable on $D_s$ to any degree q satisfying

$$q < \frac{ps}{n - pl}, \qquad (154)$$

and where

$$\|\varphi\|_{L_q(D_s)} \le M_1\,\|\varphi\|_{W_p^{(l)}(D)}, \qquad (155)$$

in which $M_1$ is a constant depending only on the domain D and the
section $D_s$. In addition, given any $\varepsilon > 0$, there exists an $\eta > 0$, the same
for all $\varphi(x)$ whose norms in $W_p^{(l)}(D)$ do not exceed a fixed number, such
that

$$\int_{D_s} |\varphi(x + \Delta x) - \varphi(x)|^q\,ds < \varepsilon, \qquad (156)$$

where $|\Delta x| \le \eta$ and the points x and $x + \tau\Delta x$ $(0 \le \tau \le 1)$ belong to
D. It follows from what has been said that a set, bounded in $W_p^{(l)}(D)$,
is compact in $L_q(D_s)$.

Notice that we can take as $D_s$ either a complete or an incomplete
plane s-dimensional section of D, as also an n-dimensional domain
belonging to D. If pl = n, we can take any q greater than unity in the
second theorem. When pl < n, the right-hand side of (154) is greater
than unity. The theorem remains valid if the plane is replaced by a
smooth surface.
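The case distinction between Theorems 1 and 2 is purely arithmetical and can be tabulated. The following helper is a hypothetical sketch (the name `embedding_target` and its return convention are not from the text); it only encodes the conditions pl > n, pl = n, pl < n and the limiting exponent on the right-hand side of (154), using exact rational arithmetic.

```python
from fractions import Fraction

def embedding_target(n, p, l, s=None):
    """Classify W_p^(l)(D), D in R^n, according to Theorems 1 and 2.

    Returns ("C", None) when pl > n (Theorem 1),
    ("L_q", None) when pl = n (any q is admissible), and
    ("L_q", q_star) when pl < n, where q must satisfy
    q < q_star = p*s / (n - p*l) as in (154).
    The section dimension s defaults to n (a subdomain of D).
    """
    if s is None:
        s = n
    p = Fraction(p)
    if p <= 1:
        raise ValueError("the theorems are stated for p > 1")
    if p * l > n:
        return ("C", None)
    if not (n - p * l < s <= n):
        raise ValueError("the section dimension must satisfy n - pl < s <= n")
    if p * l == n:
        return ("L_q", None)
    return ("L_q", p * s / (n - p * l))
```

For example, with n = 3, p = 2, l = 1 the call `embedding_target(3, 2, 1)` gives the limiting exponent 6 for the domain itself, and `embedding_target(3, 2, 1, s=2)` gives 4 for two-dimensional sections, while `embedding_target(3, 2, 2)` falls under Theorem 1.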

Note. By Theorem 1, every function $\varphi(x) \in W_p^{(l)}(D)$, given
pl > n, is also a function of C(D), i.e. it is embedded from $W_p^{(l)}(D)$ into
C(D). By (153), the embedding operator, associating each function of
$W_p^{(l)}(D)$ with the same function as an element of C(D), is bounded,
and the last assertion of the theorem amounts to the fact that this
operator is also completely continuous. A similar remark can be made
for Theorem 2. Let us also mention some corollaries of the embedding
theorems. If pl > n and the integer m satisfies $0 \le m < l - n/p$, any
$\varphi(x) \in W_p^{(l)}(D)$ is continuously differentiable in D up to order m, the
functions $D^k\varphi(x)$ with $k \le m$ being equivalent to the corresponding
derivatives of $\varphi(x)$, continuous in D, and there exists a positive
number A, depending only on D, such that

$$|D^k\varphi(x)| \le A\,\|\varphi\|_{W_p^{(l)}(D)}. \qquad (157)$$

Hence it follows that $W_p^{(l)}(D)$ with pl > n is part of the space
of functions continuously differentiable in D up to order l - [n/p] - 1
(part of space $C^{(l - [n/p] - 1)}$). If

$$m \ge 0, \quad m \ge l - \frac{n}{p} \quad \text{and} \quad s > n - (l - m)p,$$

we have on every sufficiently smooth s-dimensional manifold $F_s$ in D:

$$D^m\varphi(x) \in L_q(F_s) \quad \text{for} \quad q < \frac{ps}{n - (l - m)p}, \qquad (158)$$

and there exists a positive number $A_1$, depending only on D and $F_s$,
such that

$$\|D^m\varphi(x)\|_{L_q(F_s)} \le A_1\,\|\varphi\|_{W_p^{(l)}(D)}. \qquad (159)$$

The embedding operators are completely continuous in all our
examples.

Theorems 1 and 2 enable us to construct different norms in $W_p^{(l)}(D)$,
equivalent to the basic norm (145) or (138). The next Theorem 3 gives
a more general result in this connection.

THEOREM 3. Let $l_k(u)$ (k = 1, 2, ..., N) be linear bounded functionals
in $W_p^{(l)}(D)$ such that they do not vanish simultaneously on a
non-identically-zero polynomial of degree not greater than l - 1. Then

$$\|\varphi\|^p = \sum_{k_1+\dots+k_n=l} \|D^l\varphi\|^p_{L_p(D)} + \sum_{k=1}^{N} |l_k(\varphi)|^p \qquad (160)$$

defines a norm equivalent to the basic norm (144) or (138).
Notice first of all that, by what was said in [112], norm (160) is
equivalent, say, to

$$\|\varphi\| = \sum_{k_1+\dots+k_n=l} \|D^l\varphi\|_{L_p(D)} + \sum_{k=1}^{N} |l_k(\varphi)| \qquad (161)$$

and to other analogous expressions.
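We shall quote some examples of equivalent norms in space $W_p^{(1)}(D)$.
It follows from Theorem 3 that we can specify the norm in this space by

$$\|\varphi\| = \sum_{k=1}^{n} \Big\|\frac{\partial\varphi}{\partial x_k}\Big\|_{L_p(D)} + \Big|\int_D \varphi\,dx\Big|. \qquad (162)$$

For, the functional $\int_D \varphi(x)\,dx$ is linear and bounded in $W_p^{(1)}(D)$ by virtue of the
inequality

$$\Big|\int_D \varphi(x)\,dx\Big| \le \|\varphi\|_{L_p(D)}\,(mD)^{1/p'} \le \|\varphi\|_{W_p^{(1)}(D)}\,(mD)^{1/p'},$$

and $\int_D c\,dx \ne 0$ for any constant $c \ne 0$ (here, mD denotes the measure
of domain D).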
It follows from Theorem 2 that the norm

$$\|\varphi\| = \sum_{k=1}^{n} \Big\|\frac{\partial\varphi}{\partial x_k}\Big\|_{L_p(D)} + \|\varphi\|_{L_q(D)} \qquad (163)$$

is equivalent to norm (138) for any q satisfying the condition
$1 \le q < pn/(n - p)$ (p < n). When p = n, we can take any $q \ge 1$. If p > n,
we can again take any $q \ge 1$, or even replace the second term in (163)
by $\|\varphi\|_{C(D)}$. The same remark applies to the next formula (164).
The reader may easily verify that the expression

$$\|\varphi\| = \sum_{k=1}^{n} \Big\|\frac{\partial\varphi}{\partial x_k}\Big\|_{L_p(D)} + \|\varphi\|_{L_q(S)}, \qquad (164)$$

where S is any given smooth (n - 1)-manifold in D, and the index q
satisfies $1 \le q < p(n - 1)/(n - p)$ (p < n), also defines a norm
equivalent to (138). Similar arguments lead to various equivalent norms
in space $W_p^{(l)}(D)$ with l > 1.

115. Integral operators with a polar kernel. We now turn to a study of the
proofs of the embedding theorems, which we stated in the previous section.
We must first discuss a class of integral operators with polar kernel. As usual,
D denotes a bounded domain of n-dimensional space $R_n$. Let $\sigma_n$ be the sphere
of unit radius in $R_n$ (of n - 1 dimensions) and $|\sigma_n|$ the area of this sphere.
An elementary volume is given in $R_n$ [II; 173] by $dx = dx_1\,dx_2 \dots dx_n = r^{n-1}\,dr\,d\sigma_n$, where

$$d\sigma_n = \sin^{n-2}\theta_1\,\sin^{n-3}\theta_2 \dots \sin\theta_{n-2}\;d\theta_1\,d\theta_2 \dots d\theta_{n-2}\,d\varphi.$$

THEOREM 1. Let B(x, y) be a kernel, bounded for x and $y \in D$ and continuous
for $x \ne y$. The integral operator

$$u(x) = \int_D \frac{B(x, y)}{r^\lambda}\,f(y)\,dy \qquad (r = |x - y|) \qquad (165)$$

is completely continuous (and hence bounded) as an operator from $L_p(D)$ into
C(D) for $\lambda < n/p'$ $(1/p + 1/p' = 1)$†.
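By hypothesis, $|B(x, y)| \le B$, where B is constant. By Hölder's inequality:

$$|u(x)| \le B\,\Big[\int_D |f(y)|^p\,dy\Big]^{1/p} \Big[\int_{K_R} r^{-\lambda p'}\,dy\Big]^{1/p'},$$

where $K_R$ is the sphere of radius R with centre the origin, containing D,
and $r = |x - y|$. On passing to spherical coordinates with the origin at x,
we obtain

$$\Big[\int_{K_R} r^{-\lambda p'}\,dy\Big]^{1/p'} \le \Big[\int_0^{2R} r^{n-1-\lambda p'}\,dr \int_{\sigma_n} d\sigma_n\Big]^{1/p'} \qquad (166)$$

and consequently,

$$|u(x)| \le B\,\Big(\frac{|\sigma_n|}{n - \lambda p'}\Big)^{1/p'} (2R)^{n/p' - \lambda}\,\|f\|_{L_p(D)}. \qquad (167)$$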
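Bound (167) can be sanity-checked in the simplest situation. The sketch below is an illustration under explicit assumptions that are not in the text: n = 1, D = (0, 1) inside $K_R = [-1, 1]$ (so R = 1), B(x, y) = 1, f = 1 (so that $\|f\|_{L_p(D)} = 1$), and the "area" of the unit sphere in $R_1$ taken as $|\sigma_1| = 2$ (its two points). Under these assumptions u(x) has the closed form used in `u_closed_form`; both function names are hypothetical.

```python
import math

def u_closed_form(x, lam):
    """u(x) = integral over (0, 1) of |x - y|^(-lam) dy, for f = 1, B = 1."""
    return (x ** (1.0 - lam) + (1.0 - x) ** (1.0 - lam)) / (1.0 - lam)

def bound_167(n, lam, p, R, B, sigma_n, f_norm):
    """Right-hand side of (167); requires lam < n/p' with 1/p + 1/p' = 1."""
    p_conj = p / (p - 1.0)  # the conjugate exponent p'
    if not lam < n / p_conj:
        raise ValueError("need lambda < n/p'")
    return (B * (sigma_n / (n - lam * p_conj)) ** (1.0 / p_conj)
            * (2.0 * R) ** (n / p_conj - lam) * f_norm)
```

With $\lambda = 0.25$ and p = 2, the call `bound_167(1, 0.25, 2.0, 1.0, 1.0, 2.0, 1.0)` is larger than `u_closed_form(x, 0.25)` for every x in [0, 1], in agreement with (167); the bound is far from sharp because the integration is extended from D to a sphere of radius 2R about x.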


This inequality shows that, if $\|f\|_{L_p(D)} \le C$, where C is a constant, the
corresponding functions u(x) are uniformly bounded in D. To prove the theorem,
it is sufficient to show that they are equicontinuous. Let $\delta > 0$ be a sufficiently
small number and $D^{(\delta)}$ the set of the points $y \in D$ for which $|y - x| > \delta$.
We obtain for x and $x + \Delta x \in D$:

$$|u(x + \Delta x) - u(x)| \le \int_{D^{(\delta)}} \Big|\frac{B(x, y)}{|x - y|^\lambda} - \frac{B(x + \Delta x, y)}{|x + \Delta x - y|^\lambda}\Big|\,|f(y)|\,dy + B \int_{D - D^{(\delta)}} \frac{|f(y)|}{|x - y|^\lambda}\,dy + B \int_{D - D^{(\delta)}} \frac{|f(y)|}{|x + \Delta x - y|^\lambda}\,dy,$$

where we assume $|\Delta x| \le \delta/2$. Given any $\varepsilon > 0$, there exists an $\eta > 0$ such
that the absolute value of the difference in the first integral is $\le \varepsilon$ if $|\Delta x| \le \eta$.
On applying Hölder's inequality, we obtain the upper bound $\varepsilon\,|D|^{1/p'} C$ for
the first integral, where |D| is the measure of D. An upper bound can be found
for the second integral from inequality (167) with $2R = \delta$, whilst the third
integral does not exceed the integral of the corresponding integrand over the
part of the sphere

$$|y - (x + \Delta x)| \le \frac{3\delta}{2}$$

which belongs to D, and the upper bound of this integral is also found from (167)
with $2R = 3\delta/2$.
Finally:

$$|u(x + \Delta x) - u(x)| \le \varepsilon\,|D|^{1/p'}\,C + BC\,\Big(\frac{|\sigma_n|}{n - \lambda p'}\Big)^{1/p'} \delta^{n/p' - \lambda}\,\Big[1 + \Big(\frac{3}{2}\Big)^{n/p' - \lambda}\Big],$$

whence, since our choice of $\delta$ and $\varepsilon$ is arbitrary, we see that the u(x) are
equicontinuous for $\|f\|_{L_p(D)} \le C$. The theorem is thus proved.

† Here and below we assume p > 1.

THEOREM 2. Let $n > \lambda \ge n/p'$, the integer $s > n - (n - \lambda)p$ (or what
amounts to the same thing, $s/p > \lambda - n/p'$) and $s \le n$. The integral operator
(165) is completely continuous as an operator from $L_p(D)$ into $L_q(D_s)$, where $D_s$
is any given s-dimensional plane section and q is any number satisfying

$$q < q^* = \frac{sp}{n - (n - \lambda)p}.$$

Assuming that B(x, y) is defined in the domain $D_1$, containing D in its interior,
and possesses the above-mentioned properties in $D_1$, we obtain in addition:

$$\|u(x + ax^0) - u(x)\|_{L_q(D_s)} \le \varepsilon(a)\,\|f\|_{L_p(D)}, \qquad (168)$$

where $x^0$ is a fixed vector, and $\varepsilon(a)$ is continuous for $0 \le a \le \delta$ (where $\delta$ is a
positive number), is equal to zero for a = 0, and is defined by the constant B, as also
by the dimensions of D and $D_s$. When s = n we can take any subdomain of D as $D_s$.

Note. It need not be assumed that B(x, y) is defined in a wider domain
$D_1$, but in this case we have to regard only those displacements $ax^0$ as permissible
in (168) for which the points $x + ax^0$ are situated in D. Notice also that,
when $n - (n - \lambda)p = 0$, q is arbitrary.

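We shall divide the proof into two parts.

LEMMA 1. Given the conditions mentioned, operator (165) is bounded as an
operator from $L_p(D)$ into $L_q(D_s)$.

It follows from the conditions of the theorem that $q^* > p$, and we shall
assume for the moment that $q \ge p$. We have

$$\frac{s}{q} > \frac{s}{q^*} = \lambda - \frac{n}{p'}, \quad \text{i.e.} \quad \lambda = \frac{s}{q} + \frac{n}{p'} - 2\beta \quad (\beta > 0).$$

We put

$$a_1 = \frac{1}{q}; \quad a_2 = \frac{1}{p} - \frac{1}{q}; \quad a_3 = \frac{1}{p'} \quad (a_1 + a_2 + a_3 = 1).$$

On twice applying Hölder's inequality, we arrive at the following inequality
with three factors:

$$\int_D |f_1 f_2 f_3|\,dy \le \Big[\int_D |f_1|^{1/a_1}\,dy\Big]^{a_1} \Big[\int_D |f_2|^{1/a_2}\,dy\Big]^{a_2} \Big[\int_D |f_3|^{1/a_3}\,dy\Big]^{a_3}.$$

On applying it to the integral on the right-hand side of the obvious inequality

$$|u(x)| \le B \int_{K_R} \big(|f(y)|^{p/q}\,r^{-s/q + \beta}\big)\,\big(|f(y)|^{p(1/p - 1/q)}\big)\,\big(r^{-n/p' + \beta}\big)\,dy,$$

where f(y) is continued by zero on the part of the sphere lying outside D, we
obtain

$$|u(x)| \le B\,\Big[\int_{K_R} |f(y)|^p\,r^{-s + q\beta}\,dy\Big]^{1/q} \Big[\int_{K_R} |f(y)|^p\,dy\Big]^{1/p - 1/q} \Big[\int_{K_R} r^{-n + p'\beta}\,dy\Big]^{1/p'}.$$

The second factor on the right is $\|f\|^{1 - p/q}_{L_p(D)}$. The third is subject to the familiar
inequality

$$\int_{K_R} r^{-n + p'\beta}\,dy \le |\sigma_n|\,\frac{(2R)^{p'\beta}}{p'\beta},$$

whence

$$|u(x)| \le B\,\Big(\frac{|\sigma_n|}{p'\beta}\Big)^{1/p'} (2R)^\beta\,\|f\|^{1 - p/q}_{L_p(D)}\,\Big[\int_{K_R} |f(y)|^p\,r^{-s + q\beta}\,dy\Big]^{1/q}$$

and

$$\int_{D_s} |u(x)|^q\,dx^{(s)} \le B^q\,\Big(\frac{|\sigma_n|}{p'\beta}\Big)^{q/p'} (2R)^{q\beta}\,\|f\|^{q - p}_{L_p(D)} \int_{D_s} dx^{(s)} \int_{K_R} |f(y)|^p\,r^{-s + q\beta}\,dy, \qquad (169)$$

where the differential $dx^{(s)}$ refers to $D_s$. We change the order of integration and
find an upper bound for the inner integral of $r^{-s + q\beta}$ over $D_s$, which is taken with
constant y. We notice first that

$$-s + q\beta = -q\Big(\frac{s}{q} - \beta\Big) = -q\Big(\lambda - \frac{n}{p'} + \beta\Big) < 0.$$

On introducing into $D_s$ spherical coordinates whose centre is the projection
of y on $D_s$, we get $dx^{(s)} = \varrho^{s-1}\,d\varrho\,d\sigma_s$ and $r^{-s + q\beta} \le \varrho^{-s + q\beta}$, since $\varrho \le r$ and
$-s + q\beta < 0$. Hence

$$\int_{D_s} r^{-s + q\beta}\,dx^{(s)} \le \int\!\!\int \varrho^{-1 + q\beta}\,d\varrho\,d\sigma_s = |\sigma_s|\,\frac{(2R)^{q\beta}}{q\beta}$$

and, by (169):

$$\|u\|_{L_q(D_s)} \le (2R)^{2\beta}\,C\,\|f\|_{L_p(D)}, \qquad (170)$$

where

$$C = B\,\Big(\frac{|\sigma_n|}{p'\beta}\Big)^{1/p'} \Big(\frac{|\sigma_s|}{q\beta}\Big)^{1/q} \qquad (171)$$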

does not depend on the dimensions of D.

We have assumed $q \ge p$. If we take q < p, it is sufficient to use the inequality

$$\|u\|_{L_q(D_s)} \le \|u\|_{L_{q_1}(D_s)}\,|D_s|^{1/q - 1/q_1} \qquad (p \le q_1 < q^*) \qquad (172)$$

($|D_s|$ is the measure of $D_s$), which follows directly from Hölder's inequality;
the further factor $|D_s|^{1/q - 1/q_1}$ appears in the expression for C.

LEMMA 2. Operator (165) is continuous with respect to a displacement in the
sense indicated in Theorem 2.

As in Theorem 1,

$$|u(x + \Delta x) - u(x)| \le \int_{D^{(\delta)}} \Big|\frac{B(x, y)}{|x - y|^\lambda} - \frac{B(x + \Delta x, y)}{|x + \Delta x - y|^\lambda}\Big|\,|f(y)|\,dy + B \int_{|y - x| \le \delta} \frac{|f(y)|}{|y - x|^\lambda}\,dy + B \int_{|y - x| \le \delta} \frac{|f(y)|}{|y - (x + \Delta x)|^\lambda}\,dy, \qquad (173)$$

where, as above, $|\Delta x| \le \delta/2$. Given $\varepsilon > 0$, there exists an $\eta > 0$ such that,
with $|\Delta x| \le \eta$,

$$\|u(x + \Delta x) - u(x)\|_{L_q(D_s)} \le \varepsilon\,|D|^{1/p'}\,|D_s|^{1/q}\,\|f\|_{L_p(D)} + \dots,$$

and the row of dots is the sum of the norms of the second and third terms
on the right-hand side of (173). We have inequality (170) for these norms, with
$2R = \delta$ and $2R = 3\delta/2$, so that

$$\|u(x + \Delta x) - u(x)\|_{L_q(D_s)} \le (C_1\varepsilon + C_2\,\delta^{2\beta})\,\|f\|_{L_p(D)}, \qquad (174)$$

where

$$C_1 = |D|^{1/p'}\,|D_s|^{1/q}; \qquad C_2 = B\,\Big(\frac{|\sigma_n|}{p'\beta}\Big)^{1/p'} \Big(\frac{|\sigma_s|}{q\beta}\Big)^{1/q} \Big[1 + \Big(\frac{3}{2}\Big)^{2\beta}\Big].$$

When q < p, a further factor appears in the expression for $C_2$.

To prove Theorem 2, it remains to show that, given $\|f\|_{L_p(D)} \le A$, where
A is a constant, a set of functions u(x) is obtained, compact in $L_q(D_s)$. The
u(x) are bounded in $L_q(D_s)$ by virtue of (170), whilst they are equicontinuous by
virtue of (174), if we assume that $\Delta x$ lies in $D_s$ and notice that the
above-mentioned $\eta > 0$, which was defined according to $\varepsilon$, depends on the kernel
B(x, y), but not on f(y). Theorem 2 is proved.

THEOREM 3. Let $B_1(x, y)$ and $B_2(x, y)$ be bounded kernels for x and $y \in D$
and be continuous for $x \ne y$. Then the integral

$$I(x, z) = \int_D \frac{B_1(x, y)\,B_2(y, z)}{|x - y|^\lambda\,|y - z|^\mu}\,dy \qquad (\lambda < n;\ \mu < n), \qquad (175)$$

given x and $z \in D$, can be written as

$$I(x, z) = B(x, z)\,\varphi(|x - z|), \qquad (176)$$

where B(x, z) has the same properties as $B_i(x, y)$ (i = 1, 2) and

$$\varphi(\xi) = \begin{cases} \xi^{-(\lambda + \mu - n)} & \text{for } \lambda + \mu > n, \\ 1 + |\log\xi| & \text{for } \lambda + \mu = n, \\ 1 & \text{for } \lambda + \mu < n. \end{cases} \qquad (177)$$

Let $x \ne z$. We divide D into two parts: $D = D_1 + D_2$, such that, of the two points, $D_1$ contains
only the point x and $D_2$ only z. On choosing p' > 1 such that $\lambda < n/p'$, and
taking $f(y) = B_2(y, z)\,|y - z|^{-\mu}$ in operator (165), we can assert on the basis
of Theorem 1 that integral (175), taken over $D_1$, is a continuous function of
(x, z). Similarly for the integral over $D_2$. The continuity of I(x, z) for $x \ne z$ is
thus proved.

To prove (176), we only need to show that

$$\int_D \frac{dy}{|x - y|^\lambda\,|y - z|^\mu} \le C\,\varphi(|x - z|), \qquad (178)$$

where C is a positive constant.
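1. Let $\lambda + \mu > n$. We write $|x - z| = \delta$ and introduce the new coordinates

$$x' = \frac{x}{\delta}; \quad y' = \frac{y}{\delta}; \quad z' = \frac{z}{\delta},$$

so that $|x' - z'| = 1$. Using the ordinary notation $R_n$ for the whole of
n-dimensional space, we obtain

$$\int_D \frac{dy}{|x - y|^\lambda\,|y - z|^\mu} \le \int_{R_n} \frac{dy}{|x - y|^\lambda\,|y - z|^\mu} = \delta^{-(\lambda + \mu - n)} \int_{R_n} \frac{dy'}{|x' - y'|^\lambda\,|y' - z'|^\mu}.$$

To evaluate the integral over $R_n$, we locate the origin at the point x' and the
$y_1'$ axis in the direction from x' to z'. We now obtain

$$\int_D \frac{dy}{|x - y|^\lambda\,|y - z|^\mu} \le \delta^{-(\lambda + \mu - n)} \int_{R_n} \frac{dy'}{|y'|^\lambda\,|y' - z_0|^\mu},$$

where $z_0$ has the coordinates (1, 0, 0, ..., 0). The latter integral is convergent,
since $\lambda < n$, $\mu < n$ and $\lambda + \mu > n$, and is obviously independent of x, z,
whence it follows that

$$\int_D \frac{dy}{|x - y|^\lambda\,|y - z|^\mu} \le C\,|x - z|^{-(\lambda + \mu - n)}.$$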

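2. The case $\lambda + \mu = n$. Let $K_R$ be the sphere with centre the origin and
radius R, containing D; on introducing the same coordinates as above, we
have

$$\int_D \frac{dy}{|x - y|^\lambda\,|y - z|^\mu} \le \int_{K_R} \frac{dy}{|x - y|^\lambda\,|y - z|^\mu} = \int_{K_{R/\delta}} \frac{dy'}{|x' - y'|^\lambda\,|y' - z'|^\mu} \le \int_{K_{2R/\delta}} \frac{dy'}{|y'|^\lambda\,|y' - z_0|^\mu}.$$

We have doubled the radius in the last integral, but can integrate as before
over the sphere with centre the origin. The positive $\delta$ can be assumed
sufficiently small. Let $2R/\delta > 2$. The integral over $K_{2R/\delta}$ is split into two: over
$K_2$ and over $K_{2R/\delta} - K_2$. The integration over $K_2$ gives a positive constant
$C_1$. It remains to consider

$$\int_{K_{2R/\delta} - K_2} \frac{dy'}{|y'|^\lambda\,|y' - z_0|^\mu}.$$

Since $|y' - z_0| \ge |y'| - 1$, we obtain, on writing $|y'| = r$:

$$\int_{K_{2R/\delta} - K_2} \frac{dy'}{|y'|^\lambda\,|y' - z_0|^\mu} \le |\sigma_n| \int_2^{2R/\delta} \frac{r^{n-1}\,dr}{r^\lambda\,(r - 1)^\mu} = |\sigma_n| \int_2^{2R/\delta} \frac{dr}{r\,(1 - 1/r)^\mu} \qquad (\lambda + \mu = n),$$

or, since $r \ge 2$,

$$\int_{K_{2R/\delta} - K_2} \frac{dy'}{|y'|^\lambda\,|y' - z_0|^\mu} \le |\sigma_n|\,2^\mu \int_2^{2R/\delta} \frac{dr}{r} = |\sigma_n|\,2^\mu \log\frac{R}{\delta}.$$

Finally,

$$\int_D \frac{dy}{|x - y|^\lambda\,|y - z|^\mu} \le C_1 + C_2\,|\log\delta| \le C\,(1 + |\log|x - z||) \qquad (C_2 > 0;\ C \ge C_1),$$

whence follows inequality (178) for $\lambda + \mu = n$.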

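3. The investigation of the case λ + μ < n is basically the same as above. Notice that integral (178) is now convergent with x = z. Precisely as in the previous case, we have

$$\begin{aligned}
\int_D \frac{dy}{|x-y|^{\lambda}|y-z|^{\mu}} &\le \delta^{\,n-\lambda-\mu}\int_{K_{2R/\delta}} \frac{dy'}{|y'|^{\lambda}|y'-z_0|^{\mu}} \\
&\le \delta^{\,n-\lambda-\mu}\left[C_1 + |\sigma_n|\int_2^{2R/\delta}\frac{r^{n-1}\,dr}{r^{\lambda+\mu}\left(1-\frac{1}{r}\right)^{\mu}}\right] \\
&\le \delta^{\,n-\lambda-\mu}\left[C_1 + |\sigma_n|\,2^{\mu}\int_2^{2R/\delta} r^{\,n-\lambda-\mu-1}\,dr\right] \\
&\le \delta^{\,n-\lambda-\mu}\left[C_1 + \frac{|\sigma_n|\,2^{\mu}}{n-\lambda-\mu}\left(\frac{2R}{\delta}\right)^{n-\lambda-\mu}\right],
\end{aligned}$$

i.e. we have

$$\int_D \frac{dy}{|x-y|^{\lambda}|y-z|^{\mu}} \le C,$$

and Theorem 3 is proved.

116. Sobolev’s integral forms. We shall now assume that the domain D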
is star-shaped with respect to every point of a sphere K lying inside D. We take the centre of this sphere as origin and let R denote the radius. We introduce the following infinitely differentiable function [71]:

$$\psi(y) = \begin{cases} C\,e^{-\frac{R^2}{R^2-|y|^2}} & \text{for } |y| < R,\\[2pt] 0 & \text{for } |y| \ge R, \end{cases} \tag{179}$$

where the constant C is chosen so that the integral of ψ(y) over K is equal to unity.
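The normalization of such a cut-off function is easily checked numerically. The sketch below works in one dimension (n = 1, K the interval (−R, R)) and assumes the exponential profile exp(−R²/(R² − r²)); the names `bump`, `normalising_constant` and `psi` are illustrative and not taken from the text.

```python
import math

def bump(r, R):
    """Unnormalised profile exp(-R^2 / (R^2 - r^2)) for r < R, zero otherwise."""
    d = R * R - r * r
    return math.exp(-R * R / d) if d > 0 else 0.0

def normalising_constant(R, steps=20000):
    """C such that C * bump integrates to 1 over (-R, R); midpoint rule, n = 1."""
    h = 2.0 * R / steps
    total = sum(bump(abs(-R + (i + 0.5) * h), R) for i in range(steps))
    return 1.0 / (h * total)

def psi(y, R, C):
    """Normalised cut-off function: C * bump inside the sphere, zero outside."""
    return C * bump(abs(y), R)

if __name__ == "__main__":
    R = 1.0
    C = normalising_constant(R)
    print("C =", C, " psi(0) =", psi(0.0, R, C), " psi(R) =", psi(R, R, C))
```

In n dimensions only the volume element changes (a factor r^{n−1} times the surface of the unit sphere); the smoothness at |y| = R comes from the exponential factor alone.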

Let u(y) be any given continuous function, continuously differentiable in D. On introducing spherical coordinates with the origin at a point x, we can regard u(y) as a function of x, r and ω_n, where ω_n denotes the set of angular spherical coordinates [IV; 156], i.e. u(y) = u(x, r, ω_n), where u(x, 0, ω_n) = u(x). Let us consider the integral of u(y) ψ(y) over D. The integration is in fact confined to the sphere K. On writing ψ(y) in the form ψ(x, r, ω_n), and integrating by parts with respect to the variable r, we get

$$\begin{aligned}
\int_D u(y)\,\psi(y)\,dy &= \int_{\omega_n}\left[\int_0^{\infty} u(x,r,\omega_n)\,\psi(x,r,\omega_n)\,r^{n-1}\,dr\right]d\omega_n \\
&= -\int_{\omega_n}\left\{\int_0^{\infty} u(x,r,\omega_n)\,d_r\!\left[\int_r^{\infty}\psi(x,\varrho,\omega_n)\,\varrho^{n-1}\,d\varrho\right]\right\}d\omega_n \\
&= -\int_{\omega_n}\left[u(x,r,\omega_n)\int_r^{\infty}\psi(x,\varrho,\omega_n)\,\varrho^{n-1}\,d\varrho\right]_{r=0}^{r=\infty}d\omega_n \\
&\qquad + \int_{\omega_n}\left\{\int_0^{\infty}\frac{\partial u(x,r,\omega_n)}{\partial r}\left[\int_r^{\infty}\psi(x,\varrho,\omega_n)\,\varrho^{n-1}\,d\varrho\right]dr\right\}d\omega_n \\
&= u(x) + \int_{\omega_n}\left\{\int_0^{\infty}\frac{\partial u(x,r,\omega_n)}{\partial r}\left[\int_r^{\infty}\psi(x,\varrho,\omega_n)\,\varrho^{n-1}\,d\varrho\right]dr\right\}d\omega_n,
\end{aligned}$$

and finally

$$\int_D u(y)\,\psi(y)\,dy = u(x) - \int_D \frac{\partial u(x,r,\omega_n)}{\partial r}\,\frac{1}{r^{n-1}}\,B(x,y)\,dy, \tag{180}$$

where

$$B(x,y) = -\int_r^{\infty}\psi(x,\varrho,\omega_n)\,\varrho^{n-1}\,d\varrho, \tag{181}$$

and the point y corresponds to the spherical coordinates (r, ω_n) with the origin at x. The function B(x, y) is bounded, if x and y ∈ D, and is continuous for x ≠ y. If y → x along a straight ray, B(x, y) has the limit

$$-\int_0^{\infty}\psi(x,\varrho,\omega_n)\,\varrho^{n-1}\,d\varrho,$$

depending on the angular coordinates of the ray. It follows at once from the definition of B(x, y) that, if x belongs to the sphere K, B(x, y) = 0 for y belonging to D and lying outside K, whilst if x is outside K, B(x, y) = 0 for y belonging to D and lying outside the domain formed by the sphere K and the part of the cone with vertex x tangential to the sphere, which lies between x and the sphere. On using the formula


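$$\frac{\partial u(y)}{\partial r} = \sum_{i=1}^{n}\frac{\partial u(y)}{\partial y_i}\cos(r, y_i) = \sum_{i=1}^{n}\frac{\partial u(y)}{\partial y_i}\,\frac{y_i - x_i}{|y-x|},$$

we obtain by (180):

$$u(x) = U + \sum_{i=1}^{n}\int_D \frac{\partial u(y)}{\partial y_i}\,\frac{B_i(x,y)}{|x-y|^{n-1}}\,dy, \tag{182}$$

where

$$U = \int_D u(y)\,\psi(y)\,dy; \qquad B_i(x,y) = B(x,y)\,\frac{y_i - x_i}{|y-x|}. \tag{183}$$

The kernels B_i(x, y) obviously have the same properties as B(x, y): they are bounded for x and y ∈ D, are continuous for x ≠ y, and vanish in the above-mentioned part of D.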

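Now suppose that u(y) is continuous and has continuous derivatives up to order l in D, and let us deduce an expression for u(x) in terms of its lth order derivatives. Using (182), we can write

$$\frac{\partial u(y)}{\partial y_i} = U_i + \sum_{k=1}^{n}\int_D \frac{\partial^2 u(z)}{\partial z_i\,\partial z_k}\,\frac{B_k(y,z)}{|y-z|^{n-1}}\,dz, \tag{184}$$

where

$$U_i = \int_D \frac{\partial u(y)}{\partial y_i}\,\psi(y)\,dy. \tag{185}$$

On substituting (184) in (182) and using Theorem 3 of [115], we obtain

$$u(x) = U + \sum_{i=1}^{n} b_i(x)\,U_i + \sum_{i,k=1}^{n}\int_D \frac{\partial^2 u(z)}{\partial z_i\,\partial z_k}\,\frac{B_{ik}(x,z)}{|x-z|^{n-2}}\,dz, \tag{186}$$

where

$$b_i(x) = \int_D \frac{B_i(x,y)}{|x-y|^{n-1}}\,dy, \qquad \frac{B_{ik}(x,z)}{|x-z|^{n-2}} = \int_D \frac{B_i(x,y)\,B_k(y,z)}{|x-y|^{n-1}\,|y-z|^{n-1}}\,dy, \tag{187}$$

the kernels B_{ik}(x, z) have the same properties as B(x, y), and, by Theorem 1 of [115] (with f(y) = 1), the b_i(x) are continuous in D. The expression for u(x) in terms of the lth order derivatives is similarly obtained: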


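$$u(x) = U + \sum_{1\le k\le l-1}\ \sum_{i_1,\ldots,i_k=1}^{n} b_{i_1\ldots i_k}(x)\,U_{i_1\ldots i_k} + \sum_{i_1,\ldots,i_l=1}^{n}\int_D \frac{\partial^l u(y)}{\partial y_{i_1}\cdots\partial y_{i_l}}\,\frac{B_{i_1\ldots i_l}(x,y)}{|x-y|^{n-l}}\,dy, \tag{188}$$

where the b_{i_1...i_k}(x) are continuous in D and the kernels B_{i_1...i_l}(x, y) have the same properties as B(x, y),

$$U_{i_1\ldots i_k} = \int_D \frac{\partial^k u(y)}{\partial y_{i_1}\cdots\partial y_{i_k}}\,\psi(y)\,dy, \tag{189}$$

and the summation over i_1, ..., i_l is a summation over all the types of derivatives of order l. Notice that, when n < l, the kernels in (188) are bounded [115]. It should also be noticed that the U_{i_1...i_k} are linear functionals in L_p(D). This can be seen by writing them in the form

$$U_{i_1\ldots i_k} = (-1)^k\int_D u(y)\,\frac{\partial^k \psi(y)}{\partial y_{i_1}\cdots\partial y_{i_k}}\,dy. \tag{190}$$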


We now turn to a consideration of space W_p^{(l)}(D). We recall that the norm
in it is given by


af 
IL Boy = Hele fio |, a ol tera ” dy (191) 


(one of the equivalent norms [112]). Let us show that integral form (188) holds for any function u(x) ∈ W_p^{(l)}(D). Let u_m(x) be a sequence of functions of C^{(l)}(D), convergent to u(x) in W_p^{(l)}(D) [111]. We write down (188) for the u_m(x) and assign form (190) to the U_{i_1...i_k}, then pass to the limit as m → ∞. By Theorems 1 and 2 of [115], the integral operators with kernels B_{i_1...i_l}(x, y) |x − y|^{−(n−l)} are continuous in L_p(D), so that we are able to pass to the limit under the integral sign. Therefore, integral form (188) is valid for any functions of W_p^{(l)}(D). We now show that functions of W_p^{(l)}(D) have all possible generalized derivatives of any order k < l from L_p(D). On applying (182) to the derivative of u_m(x) of order (l − 1), we obtain


$$\frac{\partial^{l-1}u_m(x)}{\partial x_{i_1}\cdots\partial x_{i_{l-1}}} = U^{(m)}_{i_1\ldots i_{l-1}} + \sum_{i_l=1}^{n}\int_D \frac{\partial^l u_m(y)}{\partial y_{i_1}\cdots\partial y_{i_{l-1}}\,\partial y_{i_l}}\,\frac{B_{i_l}(x,y)}{|x-y|^{n-1}}\,dy, \tag{192}$$

where

$$U^{(m)}_{i_1\ldots i_{l-1}} = \int_D \frac{\partial^{l-1}u_m(y)}{\partial y_{i_1}\cdots\partial y_{i_{l-1}}}\,\psi(y)\,dy = (-1)^{l-1}\int_D u_m(y)\,\frac{\partial^{l-1}\psi(y)}{\partial y_{i_1}\cdots\partial y_{i_{l-1}}}\,dy.$$

By Theorem 1 or 2 of [115], depending on whether n − 1 < n/p' or n − 1 ≥ n/p', we can say that the integral operator on the right-hand side of (192) is continuous as an operator from L_p(D) into C(D) or L_p(D), i.e. always into L_p(D). By hypothesis, the u_m(x) are convergent in the norm of W_p^{(l)}(D) to u(x); hence it follows that, as m → ∞, the right-hand side of (192) has a limit in L_p(D), so that u(x) has generalized derivatives of order (l − 1) of L_p(D) in D, for which (192) holds with u_m(x) replaced by u(x) and U^{(m)}_{i_1...i_{l−1}} by

$$U_{i_1\ldots i_{l-1}} = (-1)^{l-1}\int_D u(y)\,\frac{\partial^{l-1}\psi(y)}{\partial y_{i_1}\cdots\partial y_{i_{l-1}}}\,dy.$$

On using the integral forms for the derivatives of any order k < l in terms of the derivatives of order l, similar to (188) and (192), we can establish in the same way the existence of all possible generalized derivatives of any lower orders from L_p(D). These integral forms can now be automatically extended to the whole of W_p^{(l)}(D). Notice that the corresponding integral operators have kernels with polarity of order n − (l − k). The boundedness of these integral operators in L_p(D) leads directly to the inequality


ku (a) | [ | atu (x) 
NE C _ Guz) 
| OEh, ...5 OFty | < ee = Bay, .- Oe, | | (193) 
Ly(D) dpeeey OB L,(D) 
whence (152) follows. 


P 

We can now turn form (189) back to (190). We have thus shown that, for 
domains star-shaped with respect to a sphere, spaces WE) and Wi) consist of the 
same set of functions, and by (193), the norms in these spaces are equivalent.t 

The integral form (188) of functions of W_p^{(l)}(D) was obtained in a somewhat different form by Sobolev (see S. L. Sobolev, Some Applications of Functional Analysis to Mathematical Physics (Nekotorye primeneniya funktsional’nogo analiza v matematicheskoi fizike), LGU, 1950).

We shall next prove the general embedding theorems stated in [114] for 
domains star-shaped with respect to a sphere, and then show how the theorems 
may be extended to a wider class of domains. Our assertion regarding the equi- 
valence of W{)(D) and WD) will thereby also be extended to a wider class 
of domains. 


(k= 1, 2,..,2—1). 


117. Embedding theorems. Let us return to the integral form (188). The terms outside the integral on the right-hand side of (188) are completely continuous operators from W_p^{(l)}(D) into C(D). For, any function u(x) ∈ W_p^{(l)}(D) is mapped by such an operator into the same function b_{i_1...i_k}(x), continuous in D, multiplied by the number U_{i_1...i_k}, representing a functional continuous in W_p^{(l)}(D). If the set of functions u(x) is bounded in W_p^{(l)}(D), we can extract from the numbers U_{i_1...i_k} a convergent sequence. The corresponding sequence of functions b_{i_1...i_k}(x) U_{i_1...i_k} is uniformly convergent in D. Theorems 1 or 2 of [115] are applicable to the integral terms of (188). We thus obtain at once the following two embedding theorems.


† In future, we need not differentiate between these two spaces.

THEOREM 1. If pl > n, every function u(x) ∈ W_p^{(l)}(D) is continuous in D (W_p^{(l)}(D) ⊂ C(D)) and the embedding operator is bounded:

$$\max_{D}|u(x)| = \|u\|_{C(D)} \le K\,\|u\|_{W_p^{(l)}(D)} \qquad (K > 0 \text{ is constant})$$

and completely continuous, i.e. it maps every set of functions, bounded in W_p^{(l)}(D), into a compact set in C(D).
THEOREM 2. Let pl ≤ n and s > n − pl. Then functions u(x) of W_p^{(l)}(D) belong to L_q(D_s) on any plane s-dimensional section D_s, for any

$$q < q^{*} = \frac{ps}{n - pl}. \tag{194}$$

The embedding operator from W_p^{(l)}(D) into L_q(D_s) is bounded and completely continuous. As an element of L_q(D_s), u(x) is continuous in the metric of L_q(D_s) with respect to a parallel displacement of the section D_s, if the latter is permissible.

Note 1. Functions u(x) of W_p^{(l)}(D) are defined apart from equivalent functions, and the statement of the theorem regarding the behaviour of u(x) on sections refers to a certain choice of the class of equivalents (cf. [113]). Notice that, as follows from the above, (188) defines precisely this type of function.

2. If we replace L_q(D_s) in Theorem 2 by L_{q*}(D_s), it can be shown that the theorem remains valid up to and including the word “bounded”.

3. If we take an s-manifold Γ_s in D (it may lie on the boundary of D), which may be mapped (at any rate piece by piece) into a plane with the aid of an l-times continuously differentiable and uniquely reversible change of variables y_i = y_i(x_1, x_2, ..., x_n) (i = 1, ..., n), Theorem 2 remains valid when D_s is replaced by Γ_s. It is required here that the change of variables be defined in some n-dimensional neighbourhood of Γ_s.

We have already remarked that the generalized derivatives of order m < l of functions u(x) ∈ W_p^{(l)}(D) are expressed by formulae analogous to (188), in terms of the derivatives of order l with the aid of integral operators with polarity of the order n − (l − m). On applying the theorems of [115], the following theorems are obtained as above:

THEOREM 3. If pl > n and 0 < m < l − n/p, the generalized derivatives of order m of functions u(x) ∈ W_p^{(l)}(D) are continuous in D, and the embedding operator from W_p^{(l)}(D) into C^{(m)}(D) is bounded and completely continuous.

THEOREM 4. If m ≥ l − n/p and s > n − (l − m)p, the generalized derivatives of order m of functions u(x) ∈ W_p^{(l)}(D) belong to L_q(D_s), on s-dimensional plane sections D_s of the domain D for any

$$q < q^{*} = \frac{ps}{n - (l-m)p}, \tag{195}$$

where

(a) $$\|D^m u\|_{L_q(D_s)} \le K\,\|u\|_{W_p^{(l)}(D)}; \tag{196}$$

(b) for any set bounded in W_p^{(l)}(D) of functions u(x), the set D^m u(x) is compact in L_q(D_s);

(c) the functions D^m u(x) in the metric of L_q(D_s) are continuous with respect to a permissible parallel displacement of D_s.

The same remarks can be made on Theorem 4 as on Theorem 2.

We now turn to the proof of Theorem 3 of [114] regarding equivalent norms in W_p^{(l)}(D). Let us recall the proposition.

Let the linear functionals l_k(u) (k = 1, 2, ..., N), bounded in W_p^{(l)}(D), be such that they do not vanish simultaneously on a non-identically-zero polynomial of degree not higher than l − 1. Then the norm in W_p^{(l)}(D), defined by

$$\|u\| = \left[\sum_{i_1,\ldots,i_l}\int_D\left|\frac{\partial^l u}{\partial y_{i_1}\cdots\partial y_{i_l}}\right|^p dy + \sum_{k=1}^{N}|l_k(u)|^p\right]^{1/p} \tag{197}$$

is equivalent to norm (191).

PROOF. Obviously, norm (197) has an upper bound expressible in terms of norm (191), since the functionals l_k(u) (k = 1, 2, ..., N) are bounded in the latter norm.

We must prove the reverse inequality. For the moment, let us simply write ‖u‖ for norm (197) and ‖u‖_{W_p^{(l)}(D)} for norm (191), equivalent to norm (145).

We have to show that

$$\|u\|_{W_p^{(l)}(D)} \le A\,\|u\| \tag{198}$$

for all functions of W_p^{(l)}(D).

We assume the opposite, i.e. that there exists an infinite sequence of positive numbers A_m (m = 1, 2, ...) and elements u_m(x) of W_p^{(l)}(D), such that A_m → +∞ as m → ∞, and

$$\|u_m\|_{W_p^{(l)}(D)} > A_m\,\|u_m\|. \tag{199}$$

By introducing constant factors into the u_m(x), we can assume that

$$\|u_m\|_{W_p^{(l)}(D)} = 1. \tag{200}$$


It follows from (199) and (200) that ‖u_m‖ → 0 as m → ∞, so that all the generalized derivatives of order l of the u_m(x) are convergent to zero in L_p(D). By (200) and Theorem 2, the sequence u_m(x) is compact in L_p(D). We extract from it a convergent subsequence, which we again denote by u_m(x), and let u_m(x) → u_0(x) in L_p(D) as m → ∞. It now follows from Theorem 2 [109] that all the generalized derivatives of order l of u_0(x) exist and are zero. Notice also that the u_m(x) are convergent to u_0(x) in the sense of norm (191). We now show that u_0(x) ≡ 0. We take a strictly interior subdomain D' of D. Given sufficiently small h, the derivatives of the mean functions u_{0h}(x) coincide in D' [109] with the mean functions of the derivatives D^l u_0(x). It follows at once from this that all the derivatives of order l of the mean functions u_{0h}(x) are zero in D', and hence the u_{0h}(x) are polynomials in x_1, ..., x_n of degree not higher than l − 1. Since the set of such polynomials forms a subspace in L_p(D'), and u_{0h}(x) → u_0(x) in L_p(D') as h → 0, we see that u_0(x) is a polynomial of degree not higher than l − 1 in any interior subdomain, i.e. everywhere in D. We now observe that, since the functionals l_k(u) are continuous in W_p^{(l)}(D), l_k(u_m) → l_k(u_0) as m → ∞ (k = 1, 2, ..., N). At the same time, it follows from (199) and (200) that l_k(u_m) → 0 as m → ∞ (k = 1, 2, ..., N). We thus find that l_k(u_0) = 0 (k = 1, 2, ..., N), and u_0(x) ≡ 0 by hypothesis of the theorem. But this last contradicts (200) and the convergence of u_m(x) to u_0(x) in W_p^{(l)}(D). The theorem is proved.

Some simple examples will now be given of the use of the above theorems.

(1) Let u(x) ∈ W_2^{(2)}(D), i.e. u(x), together with its generalized derivatives up to and including order 2, is square summable over D. Given n ≤ 3, we have pl = 4 > n, and it follows from Theorem 1 that u(x) is continuous in D. This assertion may be false with n ≥ 4.
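The arithmetic of the embedding conditions is easily mechanized. The sketch below is illustrative only: the function names are hypothetical, the inputs are taken to be integers, and the limiting exponent is computed as q* = ps/(n − pl), which is the reading of (194) assumed here.

```python
from fractions import Fraction

def is_continuous_embedding(p, l, n):
    """Condition of Theorem 1: W_p^(l)(D) is embedded in C(D) when p*l > n."""
    return p * l > n

def critical_exponent(p, l, n, s):
    """Limiting exponent q* = p*s / (n - p*l) on s-dimensional sections.

    Assumes integer arguments with p*l < n and n - p*l < s <= n; in the
    borderline case p*l = n every finite q is admissible, so no q* is returned.
    """
    gap = n - p * l
    if gap <= 0:
        raise ValueError("p*l >= n: no finite limiting exponent")
    if not (gap < s <= n):
        raise ValueError("the section dimension must satisfy n - p*l < s <= n")
    return Fraction(p * s, gap)

if __name__ == "__main__":
    # Example (1): p = 2, l = 2 gives continuity for n <= 3 only.
    print([n for n in range(1, 7) if is_continuous_embedding(2, 2, n)])
    # First derivatives square summable in three dimensions.
    print(critical_exponent(2, 1, 3, 3), critical_exponent(2, 1, 3, 2))
```

For p = 2, l = 1, n = 3 this gives q* = 6 in the domain itself (s = 3) and q* = 4 on plane sections (s = 2).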

(2) Norm (162) in W_p^{(1)}(D), that we obtained with the aid of Theorem 3 in [114], leads with p = 2 to the familiar Poincaré inequality:

$$\int_D u^2(y)\,dy \le B\left[\int_D \sum_{k=1}^{n}\left(\frac{\partial u}{\partial y_k}\right)^2 dy + \left(\int_D u(y)\,dy\right)^2\right]. \tag{201}$$

(3) Similarly, norm (164) gives with p = 2 and q = 2:

$$\int_D u^2(y)\,dy \le C\left[\int_D \sum_{k=1}^{n}\left(\frac{\partial u}{\partial y_k}\right)^2 dy + \int_S u^2(y)\,dS\right]. \tag{202}$$

Here S is a sufficiently smooth (n − 1)-dimensional manifold in D. In particular, if the boundary of D is piecewise smooth, it can be taken as the manifold S in (202). In this latter case (202) is the familiar Friedrichs inequality.
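Inequality (201) can be tried out numerically in the simplest case. The sketch below takes n = 1 and D = (0, 1), where the constant B = 1 is more than sufficient (the sharp constant for the mean-free part on the unit interval is 1/π²); the helper names and the choice of test functions are illustrative assumptions.

```python
import math

def midpoint(f, a, b, steps=2000):
    """Composite midpoint rule for the integral of f over (a, b)."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def poincare_sides(u, du, a=0.0, b=1.0):
    """Return (left side, bracket on the right side) of (201) for n = 1."""
    lhs = midpoint(lambda y: u(y) ** 2, a, b)
    gradient_term = midpoint(lambda y: du(y) ** 2, a, b)
    mean_term = midpoint(u, a, b) ** 2
    return lhs, gradient_term + mean_term

if __name__ == "__main__":
    samples = {
        "y^2": (lambda y: y * y, lambda y: 2.0 * y),
        "cos(pi y)": (lambda y: math.cos(math.pi * y),
                      lambda y: -math.pi * math.sin(math.pi * y)),
        "constant": (lambda y: 1.0, lambda y: 0.0),
    }
    for name, (u, du) in samples.items():
        print(name, poincare_sides(u, du))
```

The constant function shows why the term with the integral of u cannot be dropped: its gradient term vanishes, and the two sides are then equal.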


118. Domains of a more general type. We must now consider how to carry over the embedding theorems to a wider class of domains. Let the bounded domain D be divisible by piecewise smooth (n − 1)-manifolds into a finite number of domains, in each of which everything proved in [117] is valid. Then it is also valid in D. Obviously it is sufficient to consider the case when D is divided by a surface S_0 into two non-intersecting parts D_1 and D_2. Let u(x) ∈ W_p^{(l)}(D). We must show first of all that u(x) has in D all generalized derivatives of lower orders of L_p(D). Obviously, u(x) ∈ W_p^{(l)}(K), where K is any sphere lying in D. On applying what was said in [116] to this sphere, it will be seen that u(x) has in K all possible generalized derivatives of the form χ(x) = D^k u(x) (0 < k < l) belonging to L_p(K). The function χ(x) is defined everywhere in D and belongs to L_p(D). For, u(x) has in D_1 and D_2 a derivative D^k u(x) from L_p(D_1) and L_p(D_2) respectively. Since the generalized derivative is unique, we can say that χ(x) coincides with D^k u(x) in D_1 and D_2, so that χ(x) ∈ L_p(D). We now show that χ(x) is the generalized derivative D^k u(x) in D. Let D' be an arbitrary strictly interior subdomain of D and δ > 0 the distance of D' from the boundary of D. Let x be an arbitrary point of D' and 0 < h < δ. The function u(y) has the generalized derivative D^k u(y) = χ(y) in the sphere of radius δ with centre at the point x. On forming the means with averaging radius h, we can say [109] that D^k u_h(x) = χ_h(x) at the centre of the sphere. Since u_h(x) → u(x) as h → 0 and D^k u_h(x) → χ(x) in L_p(D'), by the second definition of generalized derivatives [109], χ(x) is the generalized derivative D^k u(x) of u(x) in D. It remains to recall that χ(x) ∈ L_p(D).

Note. Our discussion shows that functions of W_p^{(l)}(D) have in any domain D generalized derivatives of all lower orders of L_p(D').

We now consider the question of carrying over Theorem 1 of [117] to our domain D. Suppose that n < pl, so that u(x) is continuous in D_1 and D_2. We must show that u(x) is continuous in D; for this, it is sufficient to show that u(x) is continuous at points of the surface S_0. The continuity of u(x) at any interior point of D follows from the theorem for embedding W_p^{(l)} into C, applied to a sufficiently small sphere. As regards points of S_0 lying on the boundary of D, since u(x) is continuous in D_1, the boundary values of u(x) taken over any path lying in D_1 are the same as the boundary values obtained along a path in S_0. The same can be said of the boundary values on approaching from D_2. Hence it follows that the boundary values of u(x) are the same over any path, and u(x) is continuous in D. The complete continuity (and hence the boundedness) of the embedding operator follows from the fact that, given a set bounded in W_p^{(l)}(D), we can first extract a sequence convergent in C(D_1), then extract from this a subsequence convergent in C(D_2). This subsequence is obviously uniformly convergent in D.

It can similarly be shown that Theorem 3 of [117] can be carried over to the case in question. No difficulty arises in the extension of Theorem 2 [117]. We only have to observe that an s-dimensional section D_s in general also splits into two parts: D_s = D_s' + D_s'', where D_s' = D_s ∩ D_1, D_s'' = D_s ∩ D_2. On applying Theorem 2 of [117] to D_1 and D_2, we obtain

$$\|u\|_{L_q(D_s)} \le \|u\|_{L_q(D_s')} + \|u\|_{L_q(D_s'')} \le K\left[\|u\|_{W_p^{(l)}(D_1)} + \|u\|_{W_p^{(l)}(D_2)}\right] \le 2K\,\|u\|_{W_p^{(l)}(D)}. \tag{203}$$

This establishes that the embedding operator from W_p^{(l)}(D) into L_q(D_s) is bounded. We can similarly extend our statement about the strong continuity in L_q(D_s) of functions of W_p^{(l)}(D) as regards a parallel displacement of the section D_s. The complete continuity of the operator of embedding from W_p^{(l)}(D) into L_q(D_s) follows at once from (203) and the strong continuity with respect to a displacement.

No new arguments are required for the extension of Theorem 4 of [117]. 

Theorem 3 of [114], regarding equivalent norms in W_p^{(l)}(D), also remains
valid, since the proof given in [117] was based solely on the embedding theorems.

We can now say that, if each of the partial domains composing D is star- 
shaped with respect to a sphere, all the embedding theorems discussed in 
[114] and [117] are valid for D. 


119. Space C̊^{(l)}(D). Let D be a finite or infinite domain of space R_n, and C_0^{(l)}(D) the set of all finite functions φ(x) continuous in D and having continuous derivatives up to order l in D. Obviously, C_0^{(l)}(D) [114] is a linear space. As in the case of C^{(l)}(D) [114], we introduce the following norm into it:

$$\|\varphi\|_{C^{(l)}(D)} = \max_{x\in D,\ 0\le k\le l}|D^k\varphi(x)|. \tag{204}$$

The closure of C_0^{(l)}(D) with respect to this norm leads to a B space which we shall write as C̊^{(l)}(D). The elements of this space are bounded functions, continuously differentiable in D up to order l, where the functions themselves and their derivatives vanish on the boundary of D.

Spaces C̊^{(l)}(D) with different l = 0, 1, ... are naturally embedded in each other: C̊^{(l_1)}(D) ⊂ C̊^{(l_2)}(D) if l_1 > l_2, the set of elements of the former space being dense in the latter. This is easily proved by employing an averaging process.

Let us consider the space conjugate to C̊^{(l)}(D), which we denote by U^{(l)}(D) (the space of linear functionals for C̊^{(l)}(D)). It is easily seen that U^{(l_2)}(D) ⊂ U^{(l_1)}(D) for l_1 > l_2.

We can take as examples of elements of U^{(l)}(D) the functionals defined with the aid of a kernel ψ(x) summable over D by the equation

$$(m, \varphi) = \int_D \psi(x)\,\varphi(x)\,dx. \tag{205}$$

Their norm satisfies the inequality

$$\|m\| \le \int_D |\psi(x)|\,dx.$$

Such functionals are often described as being of the function type, and are identified with the kernels defining them. The remaining functionals are described as generalized functions. Functionals of the function type do not exhaust the whole of U^{(l)}(D). For instance, the functional δ(x − x_0), defined by

$$(\delta(x - x_0), \varphi) = \varphi(x_0), \tag{206}$$

where x_0 is a fixed point of D, cannot be written in form (205) with a kernel summable over D. The kernel corresponding to this functional is said to be a delta function δ(x − x_0), concentrated at the point x_0, and we write

$$\int_D \delta(x - x_0)\,\varphi(x)\,dx = \varphi(x_0). \tag{207}$$

But δ(x − x_0) is not a function in the ordinary sense. As we shall show, however, the elements of U^{(l)}(D), expressible in form (205) with a piecewise continuous kernel, are dense in U^{(l)}(D). To be more precise:

THEOREM. The functionals of the function type with a piecewise continuous kernel are dense in U^{(l)}(D) in the sense of weak convergence of the functionals.

Let us prove a preliminary lemma:

LEMMA. Given ε > 0, there exists for any fixed element φ(x) ∈ C̊^{(l)}(D) a domain D_ε lying strictly inside D such that

$$\max_{x\in D-D_\varepsilon,\ 0\le k\le l}|D^k\varphi| \le \varepsilon. \tag{208}$$

By the definition of C̊^{(l)}(D), there exists a sequence φ_m(x) of elements of C_0^{(l)}(D) which tends to φ(x) in the norm (204), i.e. there exists a subscript m_ε such that, given any p > 0,

$$\|\varphi_{m_\varepsilon} - \varphi_{m_\varepsilon+p}\|_{C^{(l)}(D)} < \varepsilon.$$

The function φ_{m_ε}(x) differs from zero only in a domain D_ε of the type considered above, so that it follows from the last inequality that, for x ∈ D − D_ε:

$$\max_{x\in D-D_\varepsilon,\ 0\le k\le l}|D^k\varphi_{m_\varepsilon+p}(x)| < \varepsilon,$$

and we get (208) in the limit as p → ∞.

We turn to the proof of our theorem. Let m ∈ U^{(l)}(D). We take the averaging kernel ω_ρ(|x − y|). Obviously, given an x belonging to a bounded domain D_ρ, lying strictly inside D and at a distance of not less than 2ρ from the boundary of D, ω_ρ(|x − y|), regarded as a function of y, belongs to C_0^{(l)}(D), so that the function

$$m_\rho(x) = (m, \omega_\rho(|x-y|)) \tag{209}$$

is defined for x ∈ D_ρ. In view of the infinite differentiability of ω_ρ(|x − y|) and the continuity of the functional m, it will also have continuous derivatives of all orders with respect to x. We now define for all x ∈ D the piecewise continuous function

$$\tilde m_\rho(x) = \begin{cases} m_\rho(x) & \text{for } x \in D_\rho,\\ 0 & \text{for } x \in D - D_\rho, \end{cases}$$

and take the sequence ρ = ρ_1, ρ_2, ..., tending to zero, the D_{ρ_k} being assumed chosen in such a way that D_{ρ_k} ⊂ D_{ρ_{k+1}}, and the D_{ρ_k} tend in the limit to D. We shall show that the functionals m̃_{ρ_k}, defined by

$$(\tilde m_{\rho_k}, \varphi) = \int_D \tilde m_{\rho_k}(x)\,\varphi(x)\,dx = \int_{D_{\rho_k}} m_{\rho_k}(x)\,\varphi(x)\,dx,$$

tend to our functional m. For this, we take an arbitrary function φ(x) ∈ C̊^{(l)}(D) and show that the difference

$$r_k = (m, \varphi) - (\tilde m_{\rho_k}, \varphi) = (m, \varphi) - \int_{D_{\rho_k}} (m, \omega_{\rho_k}(|x-y|))\,\varphi(x)\,dx \tag{210}$$

tends to zero as k → ∞, whence the theorem will follow. By our lemma, given ε > 0, we can find a bounded domain D_ε, strictly inside D, such that (208) is satisfied for φ(x). We shall assume k so large that D_ε ⊂ D_{ρ_k}. We write

$$\varphi_k(y) = \int_{D_{\rho_k}} \varphi(x)\,\omega_{\rho_k}(|x-y|)\,dx.$$


The Riemann sums for this integral:

$$I(y) = \sum_s \varphi(\xi_s)\,\omega_{\rho_k}(|\xi_s - y|)\,\Delta_s x,$$

where Δ_s x is the measure of a subdomain and ξ_s belongs to the subdomain, are uniformly convergent in y, along with their derivatives with respect to y up to order l, to the function φ_k(y) and its corresponding derivatives on indefinite subdivision by the Δ_s x, due to the continuity of φ(x) and the infinite differentiability of ω_{ρ_k}(|x − y|) with respect to y. But, since the functional is distributive and continuous in norm (204), we have

$$\sum_s \left(m,\ \varphi(\xi_s)\,\omega_{\rho_k}(|\xi_s - y|)\,\Delta_s x\right) = \left(m,\ \sum_s \varphi(\xi_s)\,\omega_{\rho_k}(|\xi_s - y|)\,\Delta_s x\right),$$

and in the limit

$$\int_{D_{\rho_k}} \left(m,\ \varphi(x)\,\omega_{\rho_k}(|x-y|)\right)dx = \left(m,\ \int_{D_{\rho_k}} \varphi(x)\,\omega_{\rho_k}(|x-y|)\,dx\right),$$

so that expression (210) for r_k can be written as

$$r_k = \left(m,\ \varphi(y) - \int_{D_{\rho_k}} \varphi(x)\,\omega_{\rho_k}(|x-y|)\,dx\right). \tag{211}$$

The function φ_k(y) is not the usual average of φ(x) with kernel ω_{ρ_k}(|x − y|), since the domain of integration D_{ρ_k} need not contain the whole of the sphere |x − y| < ρ_k, if y ∈ D. But if y belongs to some strictly interior bounded subdomain of D, the sphere |x − y| < ρ_k will belong to D_{ρ_k} for all sufficiently large k, and for such y, φ_k(y) is the average of φ(x) with kernel ω_{ρ_k}(|x − y|). It is clear that φ_k(y) ∈ C_0^{(l)}(D). Let us show that the φ_k(y) are convergent to φ(y) in norm (204). For this, we take δ > 0 less than the distance from D_ε to the boundary of D, and write D_δ for the domain obtained by associating with D_ε all the spheres of radius δ with centres in D_ε. For y ∈ D_ε and all sufficiently large k, D_δ ⊂ D_{ρ_k} and

$$\varphi_k(y) = \int_{|x-y|\le\rho_k} \varphi(x)\,\omega_{\rho_k}(|x-y|)\,dx.$$

The φ_k(y) are therefore convergent, uniformly with respect to y ∈ D_ε, to φ(y) as k → ∞, and the same for their derivatives up to order l with respect to y [71]. If y ∈ D − D_ε, both φ(y) and φ_k(y) are uniformly small together with their derivatives up to order l, since (208) holds for φ(y) when y ∈ D − D_ε. Thus the φ_k(y) are convergent to φ(y) in norm (204), and it follows from (211) that r_k → 0 as k → ∞. The theorem is proved.
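The construction used in the proof can be followed numerically in the simplest instance, m = δ(x − x_0) on the real line. In that case the averaged kernel is just ω_ρ(|x − x_0|), and the pairing of this kernel with a test function should approach φ(x_0) as ρ → 0. The sketch below assumes a particular averaging profile, exp(−1/(1 − t²)) normalised numerically; that profile and all helper names are illustrative assumptions rather than the text's definitions.

```python
import math

def profile(t):
    """Assumed smooth profile supported on |t| < 1."""
    d = 1.0 - t * t
    return math.exp(-1.0 / d) if d > 0 else 0.0

def midpoint(f, a, b, steps=4000):
    """Composite midpoint rule for the integral of f over (a, b)."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

# Normalisation so that the one-dimensional averaging kernel integrates to 1.
NORM = midpoint(profile, -1.0, 1.0)

def omega(rho, r):
    """Averaging kernel with radius rho, evaluated at distance r."""
    return profile(r / rho) / (rho * NORM)

def averaged_delta_kernel(x0, rho):
    """Kernel of the function-type functional obtained by averaging delta(x - x0)."""
    return lambda x: omega(rho, abs(x - x0))

def pairing(kernel, phi, a, b):
    """Value (m, phi) of a function-type functional, form (205), over (a, b)."""
    return midpoint(lambda x: kernel(x) * phi(x), a, b)

if __name__ == "__main__":
    x0 = 0.3
    for rho in (0.2, 0.1, 0.05):
        value = pairing(averaged_delta_kernel(x0, rho), math.cos, x0 - rho, x0 + rho)
        print(rho, value, abs(value - math.cos(x0)))
```

The printed errors decrease with ρ, which is the weak convergence asserted by the theorem in this special case; the kernels themselves grow without bound at x_0 and do not converge as functions.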

We could have shown that functionals of the function type with smooth kernels are also dense in U^{(l)}(D).

Let us now define the operation of multiplication by a function a(x) and the operation of differentiation ∂/∂x_k for elements of U^{(l)}(D).

Let a(x) ∈ C^{(r)}(D). If D is an infinite domain, we assume that a(x) and its derivatives are bounded. We introduce the linear operator

$$A\varphi = a(x)\,\varphi(x),$$

which maps φ(x) ∈ C̊^{(l)}(D) into a(x) φ(x) ∈ C̊^{(s)}(D), where s = min (l, r). The operation of multiplying an element m of U^{(s)}(D) by a(x) is defined as the operator A* conjugate to A, i.e. we define it by the equation

$$(m, A\varphi) = (A^{*}m, \varphi), \tag{212}$$

which has to be satisfied for all φ(x) ∈ C̊^{(l)}(D).

We can now apply the operator A* to an element of U^{(s)}(D) and A*m ∈ U^{(l)}(D). If the functional m has the form (205) with kernel summable over D, we have

$$(m, A\varphi) = \int_D \psi(x)\,[a(x)\,\varphi(x)]\,dx = \int_D [\psi(x)\,a(x)]\,\varphi(x)\,dx,$$

i.e. A*m is also a functional of the function type with kernel a(x) ψ(x).
We now consider the differentiation operator

$$B\varphi = -\frac{\partial\varphi(x)}{\partial x_k}.$$

It is a bounded operator from C̊^{(l)}(D) (l ≥ 1) into C̊^{(l−1)}(D). The conjugate operator B* is also defined by an equation of form (212):

$$(m, B\varphi) = (B^{*}m, \varphi) \tag{213}$$

and maps an element m of U^{(l−1)}(D) into U^{(l)}(D). This equation shows that, for a functional m expressible in form (205) with continuously differentiable kernel ψ(x), the kernel ∂ψ(x)/∂x_k corresponds to the functional B*m, since

$$(B^{*}m, \varphi) = (m, B\varphi) = -\int_D \psi(x)\,\frac{\partial\varphi(x)}{\partial x_k}\,dx = \int_D \frac{\partial\psi(x)}{\partial x_k}\,\varphi(x)\,dx.$$


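Let us consider how to evaluate operators A* and B* of a functional dependent on a δ-function, i.e. on functional (206). Let φ(x) ∈ C̊^{(l)}(D) (l ≥ 1). Then

$$\begin{aligned}
(\delta(x - x_0), A\varphi) &= (A^{*}\delta(x - x_0), \varphi) = a(x_0)\,\varphi(x_0),\\
(\delta(x - x_0), B\varphi) &= (B^{*}\delta(x - x_0), \varphi) = -\left.\frac{\partial\varphi}{\partial x_k}\right|_{x=x_0}.
\end{aligned} \tag{214}$$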
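The definitions (212) and (213) translate directly into operations on callables. The following sketch treats the one-dimensional case, represents a functional as a Python function acting on test functions, and approximates the derivative of φ by a central difference; the names `delta`, `multiply` and `differentiate`, and the step `h`, are illustrative assumptions.

```python
import math

def delta(x0):
    """Functional (206): maps a test function phi to phi(x0)."""
    return lambda phi: phi(x0)

def multiply(a, m):
    """Conjugate operator A*: (A*m, phi) = (m, a * phi), as in (212)."""
    return lambda phi: m(lambda x: a(x) * phi(x))

def differentiate(m, h=1e-5):
    """Conjugate operator B*: (B*m, phi) = (m, -phi'), as in (213).

    The derivative of phi is approximated by a central difference with step h.
    """
    return lambda phi: m(lambda x: -(phi(x + h) - phi(x - h)) / (2.0 * h))

if __name__ == "__main__":
    x0 = 0.3
    d = delta(x0)
    print(multiply(math.exp, d)(math.sin), math.exp(x0) * math.sin(x0))
    print(differentiate(d)(math.sin), -math.cos(x0))
```

Both printed pairs agree with (214): multiplication gives a(x_0) φ(x_0), and differentiation gives minus the derivative of φ at x_0. Only the values of φ near x_0 enter, so the test functions need not be cut off for this check.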


We can introduce successive differentiation, where the result is 
easily shown to be independent of the order of differentiation. We can 
hence define derivatives up to order l and various differential operators
for elements of U^{(l)}(D). The same problems can be posed for these
operators as for ordinary functions; viz. Cauchy’s problem and various 
boundary value problems. Generalized functions were first introduced by 
Sobolev when solving Cauchy’s problem for linear hyperbolic equations 
(1936). This extension of the class of entities proves useful from two 
points of view: firstly, it may happen that a problem has no solution 
in the class of ordinary functions, but has a solution in the class 
of generalized functions (functionals). Secondly, it is sometimes 
easier to prove the existence of a ‘“‘poor” solution, represented by a 
generalized function, and then to investigate the question as to 
whether this generalized function is an ordinary one. 

Both these circumstances become clear in the example of Cauchy’s 
problem for various systems of partial differential equations with 
constant coefficients: 


Ou; (x, t) ae 5 OMy (a, &) 
Ee UF, + Say uy (et) + h(a (215) 


On applying Fourier’s transformation with respect to x, this problem reduces to the Cauchy problem for a system of ordinary differential equations with constant coefficients, depending on the numerical parameters α_k. Application of the inverse Fourier transformation enables us to pass from the solutions of these ordinary equations to the solutions of the initial problem. If we remain within the framework of classical Fourier transformations, we are confined to considering only initial functions and terms f_i decreasing in a definite sense. The subject was investigated by Petrovskii with this assumption. But it was shown from supplementary arguments in this author’s works that there exists a class of so-called hyperbolic systems, for which the solution at an arbitrary point of space (x_0, t_0) is defined by the values of the initial functions φ_i(x) only in some bounded part of space x, depending on the point (x_0, t_0). It was thereby established that problem (215) for hyperbolic systems is uniquely soluble for any behaviour of φ_i(x) with indefinite increase of |x|.

Further, it has been shown in works of Tikhonov, Ladyzhenskaya and Edelmann that the initial functions for so-called parabolic systems can be taken as not merely non-decreasing as |x| → ∞, but even as indefinitely (exponentially) increasing. But definite restrictions have to be imposed in this case on the order of growth in order to preserve the uniqueness theorems. Finally, as regards the inverse Cauchy problem for the heat conduction equation (i.e. the Cauchy problem for the equation ∂u/∂t = Δu, solved downwards in t), it was known to have an ordinary solution only with special initial data.

All these facts required a more careful study of problem (215) 
and in connection with this, of the Fourier transformations of functions 
with arbitrary behaviour at infinity. This study was initiated by 
Schwartz and performed in detail by Gel’fand and Shilov. The Fourier 
transform of a function increasing at infinity is in general no longer
a function, but a functional in some U^{(l)}(R_n). It is a matter of further
considering the Cauchy problem for systems of ordinary differential 
equations in the class of these functionals, then passing from these 
to the solutions of problem (215), which prove in certain cases to be 
ordinary, and in others, generalized functions. We shall not give 
the results here of these investigations, as carried out by Gel’fand, 
Shilov and their pupils, but refer the reader to the works of these 
authors on generalized functions and their applications.†

† Three parts of a substantial work by Gel’fand and Shilov on generalized
functions and their applications have just appeared.

However, we shall mention here some facts regarding the solutions in
functionals of linear second order equations of the elliptic, parabolic and
hyperbolic types with variable but smooth coefficients. We take one of these
equations:

$$E(u) = f(x), \tag{216}$$


in which $f(x)$ is a function having a singularity at $x = x_0$. For elliptic and
parabolic equations, all the solutions of (216) prove to be ordinary functions,
smooth everywhere excepting possibly at $x_0$, where they may have a singularity.
Taking the example of Laplace’s operator, we shall show below that, even if
$f(x)$ is a delta-function, concentrated at the point $x_0$, the solutions of (216)
are still ordinary functions, having only a polar singularity at $x_0$. If equations
(216) are taken to be homogeneous, all the solutions are ordinary functions.

This is not the case for hyperbolic equations. For these, the singularity of
$f(x)$ is extended over entire domains, and the solution may prove to be a
generalized and not an ordinary function. An example may be given. We take the
wave equation $u_{tt} = u_{x_1 x_1} + u_{x_2 x_2} + u_{x_3 x_3}$. One of its
solutions is given by Poisson’s formula [II; 71]:

$$u(x, t) = \frac{t}{4\pi} \int_0^{2\pi}\!\!\int_0^{\pi} g(x + tn) \sin\theta \, d\theta \, d\varphi. \tag{217}$$

We know that, given a thrice continuously differentiable $g(x)$, this formula
yields a twice continuously differentiable solution of the wave equation,
satisfying the initial conditions $u(x, 0) = 0$; $u_t(x, 0) = g(x)$. Let $g_m(x)$ be
thrice continuously differentiable non-negative functions, tending to the
function $g(x) = 1/(x_1^2 + x_2^2)$ as $m \to \infty$. It is clear from (217) that the
corresponding solutions $u_m(x, t)$ will tend to $+\infty$ for $(x, t)$ lying in the
domain $\sqrt{x_1^2 + x_2^2} < t$ of the half-space $t > 0$. This tells us that no
solution in the form of an ordinary function exists for the wave equation
corresponding to the initial conditions $u(x, 0) = 0$, $u_t(x, 0) = 1/(x_1^2 + x_2^2)$.
Nevertheless the solution is unique in the class of functionals. Similarly, by
using Kirchhoff’s formula, it can be shown that the non-homogeneous wave
equation with right-hand side $f(x)$ equal to $\delta(x - x_0)$ also has no solutions in
the class of ordinary functions, but has them in the class of functionals.

We propose to consider all these questions in more detail in a later volume. 
As already mentioned, Sobolev’s works on the Cauchy problem for hyperbolic 
equations were the first to pose and solve problems in the class of generalized 
functions. 

We shall conclude by proving the above assertions regarding the solutions 
of Laplace’s equation. 

Let $D$ be a bounded three-dimensional domain and $|x - x_0| = r$ the distance
from the variable point $x$ to $x_0$. Assuming the boundary of $D$ to be sufficiently
smooth, we obtain for $\varphi(x) \in C_0^{(l)}(D)$ by applying Green’s formula:

$$\varphi(x_0) = -\frac{1}{4\pi} \int_D \frac{\Delta\varphi(x)}{r} \, dx$$

or

$$\int_D \left(-\frac{1}{4\pi r}\right) \Delta\varphi(x) \, dx = (\delta(x - x_0), \varphi(x)),$$

whence it is clear, on the basis of the definition of the derivative of a functional,
that the functional $m_0$ with kernel $(-1/4\pi r)$ satisfies the non-homogeneous
Laplace equation:

$$\frac{\partial^2 m_0}{\partial x_1^2} + \frac{\partial^2 m_0}{\partial x_2^2} + \frac{\partial^2 m_0}{\partial x_3^2} = \delta(x - x_0). \tag{218}$$


We have thereby shown that one of the solutions of (218) is a functional
of the function type. To obtain all the solutions of (218), we only need to find all
the solutions in functionals of the homogeneous Laplace equation:

$$\frac{\partial^2 m}{\partial x_1^2} + \frac{\partial^2 m}{\partial x_2^2} + \frac{\partial^2 m}{\partial x_3^2} = 0. \tag{219}$$

We must show that all the functionals $m$ satisfying this equation are
expressible in terms of a kernel, and that these kernels are functions harmonic
in $D$. Equation (219) is equivalent to

$$(m, \Delta\varphi) = 0 \tag{220}$$

for any function $\varphi(x) \in C_0^{(l)}(D)$. We take the following function as $\varphi(x)$:

$$\varphi(x) = \frac{1}{4\pi |x - y|} \left[ \psi\!\left(\frac{|x - y|}{\rho_1}\right) - \psi\!\left(\frac{|x - y|}{\rho_2}\right) \right], \tag{221}$$

where $\psi(\xi)$ is a non-negative infinitely differentiable function, equal to 1 for
$\xi \in [0, 1/2]$ and zero for $\xi > 1$. If the point $y$ lies inside $D$, $\varphi(x) \in C_0^{(l)}(D)$
for sufficiently small $\rho_1$ and $\rho_2$. The function


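$$\omega_\rho(|x - y|) = \begin{cases} 0 & \text{for } x = y, \\[4pt] \Delta_x \left[ \dfrac{1}{4\pi |x - y|} \, \psi\!\left(\dfrac{|x - y|}{\rho}\right) \right] & \text{for } |x - y| > 0, \end{cases}$$

equal to zero for $|x - y| < \rho/2$ and for $|x - y| > \rho$, can be taken as the
averaging kernel, since, with $r = |x - y|$,

$$\int \omega_\rho(|x - y|) \, dx = \int_{\rho/2 \le r \le \rho} \Delta_x \left[ \frac{1}{4\pi r} \psi\!\left(\frac{r}{\rho}\right) \right] dx = \int_{r = \rho} \frac{\partial}{\partial r} \left[ \frac{1}{4\pi r} \psi\!\left(\frac{r}{\rho}\right) \right] dS - \int_{r = \rho/2} \frac{\partial}{\partial r} \left[ \frac{1}{4\pi r} \psi\!\left(\frac{r}{\rho}\right) \right] dS = 1,$$

and

$$\omega_\rho(|x - y|) = \frac{1}{\rho^3} \, \chi\!\left(\frac{|x - y|}{\rho}\right),$$

where

$$\chi(\xi) = \left( \frac{d^2}{d\xi^2} + \frac{2}{\xi} \frac{d}{d\xi} \right) \left[ \frac{1}{4\pi\xi} \psi(\xi) \right].$$

On substituting function (221) for $\varphi(x)$ in (220), we obtain

$$0 = (m, \; \omega_{\rho_1}(|x - y|) - \omega_{\rho_2}(|x - y|)). \tag{222}$$

This can be rewritten in accordance with notation (209) as

$$m_{\rho_1}(y) = m_{\rho_2}(y)$$

for any $y$ whose distance from the boundary of $D$ is greater than $\max(\rho_1, \rho_2)$.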
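We take the set $V_\delta$ of all $\varphi(x)$ of $C_0^{(l)}(D)$, equal to zero outside some domain
$D_\delta$ at a distance $\delta$ greater than $2\max(\rho_1, \rho_2)$ from the boundary of $D$. We have
for such $\varphi(x)$:

$$\int_{D_\delta} \varphi(y) \, m_{\rho_1}(y) \, dy = (m_{\rho_1}, \varphi) = \int_{D_\delta} \varphi(y) \, m_{\rho_2}(y) \, dy = (m_{\rho_2}, \varphi).$$

On the other hand, it has been shown that $(m_{\rho_k}, \varphi) \to (m, \varphi)$ as $\rho_k \to 0$.

Consequently, for all $\rho_k < \delta/2$ and the $\varphi(x)$ taken by us:

$$(m, \varphi) = (m_{\rho_k}, \varphi) = \int_{D_\delta} \varphi(y) \, m_{\rho_k}(y) \, dy \quad (k = 1, 2),$$

i.e. the functional $m$ is specified by the kernel

$$m_\delta(x) = \begin{cases} 0 & \text{for } x \in D - D_\delta, \\ m_{\rho_k}(x) & \text{for } x \in D_\delta. \end{cases}$$

Let us show that $m_\delta(x)$ is a harmonic function in $D_\delta$. In fact, it follows from
(220) that, for $\varphi(x) \in V_\delta$:

$$0 = (m, \Delta_x \varphi) = \int_{D_\delta} m_\delta(x) \, \Delta_x \varphi(x) \, dx = \int_{D_\delta} \varphi(x) \, \Delta_x m_\delta(x) \, dx,$$

and, since $V_\delta$ is dense in $L_2(D_\delta)$, we have for $x \in D_\delta$:

$$\Delta_x m_\delta(x) = 0.$$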


Since the number $\delta > 0$ was taken arbitrarily, and since $m_{\delta'}(x) = m_\delta(x)$
for $\delta' < \delta$ and $x \in D_\delta$, we can say that the family of harmonic functions $m_\delta(x)$
defines a function $m(x)$ harmonic in $D$, coinciding with $m_\delta(x)$ for $x \in D_\delta$. This
harmonic function in fact generates the functional $m$ required, since, if we
take an arbitrary $\varphi(x)$ of $C_0^{(l)}(D)$, it is equal to zero outside some domain $D_\delta$,
so that we have for it, in view of our discussion:

$$(m, \varphi) = \int_{D_\delta} m_\delta(x) \, \varphi(x) \, dx = \int_D m(x) \, \varphi(x) \, dx. \tag{223}$$

The behaviour of $m(x)$ as $x$ approaches the boundary of $D$ is defined by the
fact that integral (223) must be convergent for any $\varphi(x)$ of $C_0^{(l)}(D)$.

A corollary must be mentioned of the result obtained. Suppose that $\psi(x)$ is
a function summable in $D$ ($D$ is a finite domain) and

$$\int_D \psi(x) \, \Delta_x \varphi(x) \, dx = 0$$

for any $\varphi(x)$ of $C_0^{(l)}(D)$. The functional

$$(m, \varphi) = \int_D \psi(x) \, \varphi(x) \, dx$$

satisfies equation (219), and by what has been said, we can assert that $\psi(x)$
is equivalent to a function harmonic in $D$.

A similar discussion can be given of linear functionals on various
families of functions. We shall mention an example. Let $K$ be the
family of real functions $\varphi(x)$, defined throughout space $R_n$, finite and
having continuous derivatives of all orders. The family $K$ is a linear
space. It cannot be normed in the ordinary sense of the word, and
we introduce the following definition for this space only:

DEFINITION. We shall say that the sequence $\varphi_k(x)$ $(k = 1, 2, \ldots)$ of
functions of $K$ tends to zero if there exists a bounded domain outside
which all the $\varphi_k(x)$ are zero, and if the $\varphi_k(x)$ and all their derivatives tend
uniformly to zero as $k \to \infty$.

The functional $(m, \varphi)$ is defined in $K$ by associating the real number
$(m, \varphi)$ with each $\varphi(x) \in K$. Such a functional is said to be linear
(or linear and continuous), if it is distributive, i.e.
$(m, c_1 \varphi_1 + c_2 \varphi_2) = c_1 (m, \varphi_1) + c_2 (m, \varphi_2)$, and has the property that
$(m, \varphi_k) \to 0$ as the sequence $\varphi_k(x)$ tends to zero. Functionals of the function
type are defined by (205), where $D$ is $R_n$ and $\psi(x)$ is any given function,
summable in any bounded domain. Multiplication of the functional
by the function $\omega(x)$, having continuous derivatives of all orders,
is defined by the equation $(\omega m, \varphi) = (m, \omega\varphi)$ and differentiation of
a functional by

$$(D^k m, \varphi) = (-1)^k (m, D^k \varphi).$$

The functional has derivatives of all orders. The theory of functionals
in space $K$ is discussed in the above-mentioned work by Gel’fand and
Shilov.


CHAPTER V 


HILBERT SPACE 


§ 1. The theory of bounded operators 


120. Axioms of the space. When discussing function space $L_2$ and
sequence space $l_2$, we explained their identity of structure; they are
realizations of the same abstract space, to the investigation of which
the present chapter is devoted. It was first introduced by Hilbert
in the form $l_2$ and is generally known as $H$ or Hilbert space. As we
shall see below, $H$ space is a particular case of a $B$ space, and everything
said about these latter applies to $H$ space. But apart from all this,
$H$ space has its own special properties.

Let us enumerate the axioms defining H. H space is a linear space, 
the elements of which satisfy axiom A of [95]. We shall assume that 
elements of H can be multiplied by complex numbers (a complex 
linear space). 

We shall further assume that, if there is no proviso to the contrary, 
given any positive integer n, there exist n linearly independent 
elements (axiom B of [95]). We now introduce a new axiom, relating 
to scalar products: 

Axiom C. Each pair of elements $x$ and $y$ of $H$ has associated with
it a definite complex number, which is called the scalar product of $x$
and $y$. It is denoted by $(x, y)$. This scalar product has the following
properties:

$$(y, x) = \overline{(x, y)}; \quad (x' + x'', y) = (x', y) + (x'', y); \quad (x, x) > 0 \ \text{if } x \ne 0; \quad (ax, y) = a(x, y). \tag{1}$$

We recall that $x \ne 0$ means that $x$ is not the zero element. The
following are immediate corollaries of the above properties:

$$(x, y' + y'') = (x, y') + (x, y''); \quad (x, ay) = \bar{a}(x, y); \quad (x, x) = 0 \ \text{if } x = 0; \quad (x, y) = 0 \ \text{if } x \text{ or } y = 0. \tag{2}$$

The expression $\sqrt{(x, x)}$, in which the value of the root is assumed $\ge 0$,
is called the norm of element $x$ and is denoted by $\|x\|$, as in [95].
We now show that the norm thus defined satisfies the conditions
of axiom C of [95]. It follows at once from the definition that $\|x\| \ge 0$,
where the = sign only holds for the zero element. We have further:

$$\|ax\|^2 = (ax, ax) = |a|^2 (x, x) = |a|^2 \cdot \|x\|^2, \quad \text{i.e.} \quad \|ax\| = |a| \cdot \|x\|. \tag{3}$$

Hence it follows that $\|-x\| = \|x\|$ [95].
It remains to verify that

$$\|x + y\| \le \|x\| + \|y\|. \tag{4}$$

We must first show that

$$|(x, y)| \le \|x\| \cdot \|y\|, \tag{5}$$

which will in future be called Buniakowski’s inequality [cf. IV; 35].
Let $x$ and $y$ be any elements of $H$, and $a$ and $b$ any complex numbers.
We have

$$\|ax + by\|^2 = (ax + by, ax + by) = a\bar{a}(x, x) + a\bar{b}(x, y) + \bar{a}b(y, x) + b\bar{b}(y, y) \ge 0.$$

This is a Hermitian positive form (with respect to variables $a$ and
$b$), whose discriminant must be non-negative, i.e.

$$(x, x)(y, y) - (x, y)(y, x) \ge 0 \quad \text{or} \quad \|x\|^2 \cdot \|y\|^2 - |(x, y)|^2 \ge 0,$$

whence (5) follows. We turn to the proof of (4). Using the obvious
equation

$$(x, y) + (y, x) = 2\operatorname{Re}(x, y), \tag{6}$$

where $\operatorname{Re}$ is the notation for the real part, we get

$$\|x + y\|^2 = (x + y, x + y) = \|x\|^2 + \|y\|^2 + 2\operatorname{Re}(x, y),$$

whence it follows, in view of $|\operatorname{Re}(x, y)| \le |(x, y)|$ and inequality (5):

$$\|x + y\|^2 \le \|x\|^2 + \|y\|^2 + 2\|x\| \cdot \|y\| = (\|x\| + \|y\|)^2,$$

and we arrive at (4).
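
The inequalities just proved can be checked numerically in the simplest concrete model. The following is only a sketch under stated assumptions: the space is taken to be three-dimensional complex space with the scalar product $(x, y) = \sum_k x_k \bar{y}_k$, and the helper names `inner` and `norm` as well as the sample vectors are chosen here purely for illustration.

```python
import math

def inner(x, y):
    # (x, y) = sum_k x_k * conj(y_k): linear in x, conjugate-linear in y
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def norm(x):
    # ||x|| = sqrt((x, x)); (x, x) is real and non-negative
    return math.sqrt(inner(x, x).real)

# Illustrative sample data (not taken from the text)
x = [1 + 2j, -1j, 3]
y = [2 - 1j, 4, 1 + 1j]
a = 0.5 - 2j

ax = [a * t for t in x]
x_plus_y = [s + t for s, t in zip(x, y)]

# First property in (1): (y, x) is the complex conjugate of (x, y)
hermitian_ok = abs(inner(y, x) - inner(x, y).conjugate()) < 1e-12
# Formula (3): ||ax|| = |a| ||x||
homogeneous_ok = abs(norm(ax) - abs(a) * norm(x)) < 1e-12
# Buniakowski's inequality (5)
buniakowski_ok = abs(inner(x, y)) <= norm(x) * norm(y) + 1e-12
# Triangle inequality (4)
triangle_ok = norm(x_plus_y) <= norm(x) + norm(y) + 1e-12
```

A small tolerance is added on the right-hand sides only to absorb floating-point rounding; the inequalities themselves hold exactly.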
As in [95], it follows from (4) that

$$\|x - y\| \ge \|x\| - \|y\|; \quad \|x - y\| \le \|x - z\| + \|z - y\|. \tag{7}$$

The concept of norm leads, as in [95], to the concept of the distance
between elements $x$ and $y$: $\rho(x, y) = \|x - y\|$, and to the concept
of the limit of a sequence $x_n$ (strong convergence):

$$x_n \Rightarrow x_0, \quad \text{if} \quad \|x_n - x_0\| \to 0. \tag{8}$$

Everything said earlier about the limit is still valid. We have seen
[95] that, if $a_n \to a_0$, $x_n \Rightarrow x_0$ and $y_n \Rightarrow y_0$, then $a_n x_n \Rightarrow a_0 x_0$ and
$x_n + y_n \Rightarrow x_0 + y_0$. We shall now prove a theorem.

THEOREM. If $x_n \Rightarrow x_0$ and $y_n \Rightarrow y_0$, then $(x_n, y_n) \to (x_0, y_0)$.

We put $u_n = x_n - x_0$ and $v_n = y_n - y_0$. By hypothesis, $\|u_n\|$ and $\|v_n\| \to 0$.
We have

$$(x_0, y_0) - (x_n, y_n) = (x_0, y_0) - (x_0 + u_n, y_0 + v_n) = -(x_0, v_n) - (u_n, y_0) - (u_n, v_n)$$

and we can write, on applying (5):

$$|(x_0, y_0) - (x_n, y_n)| \le \|x_0\| \cdot \|v_n\| + \|u_n\| \cdot \|y_0\| + \|u_n\| \cdot \|v_n\|,$$

whence it follows, since $\|u_n\|$ and $\|v_n\| \to 0$, that $(x_n, y_n) \to (x_0, y_0)$.

This gives us, when $x_n = y_n$: if $x_n \Rightarrow x_0$, then $\|x_n\| \to \|x_0\|$
[cf. 95].

If the sequence $x_n$ has a limiting element, it is mutually convergent,
i.e. $\|x_n - x_m\| \to 0$ as $n$ and $m \to \infty$ [95]. We shall assume
that $H$ space is complete.

AXIOM D. If a sequence $x_n$ is mutually convergent, there exists an
element $x_0$ of $H$ such that $x_n \Rightarrow x_0$.

In addition, we shall take the following axiom:

AXIOM E. $H$ space is separable. In other words, there exists a
denumerable set of elements of $H$ which is dense in $H$.

It follows at once from what has been said that $H$ is a $B$ space.


121. Orthogonality and orthogonal systems of elements. If $(x, y) = 0$,
then $(y, x) = 0$, by (1), and the elements $x$ and $y$ are now said
to be mutually orthogonal, or simply orthogonal, and we write
$x \perp y$. By (2), the zero element is orthogonal to any element.

Let $x_1, x_2, \ldots, x_m$ be mutually orthogonal elements, i.e. $(x_p, x_q) = 0$
for $p \ne q$. Let us take the square of the norm of the sum of
these elements:

$$\|x_1 + x_2 + \ldots + x_m\|^2 = (x_1 + x_2 + \ldots + x_m, \; x_1 + x_2 + \ldots + x_m).$$

On expanding the scalar product in accordance with (1) and (2)
and using the orthogonality, we obtain the following Pythagoras’
theorem for mutually orthogonal elements:

$$\|x_1 + x_2 + \ldots + x_m\|^2 = \|x_1\|^2 + \|x_2\|^2 + \ldots + \|x_m\|^2. \tag{9}$$


As in [95], we can use the concept of limit to bring in the idea
of the convergence of an infinite series formed from elements of $H$:

$$u_1 + u_2 + u_3 + \ldots \tag{10}$$

Such a series is called convergent if the sum of its first $n$ terms,
$s_n = u_1 + u_2 + \ldots + u_n$, tends to a limit: $s_n \Rightarrow u$ as $n \to \infty$. The
element $u$ is called the sum of series (10). The necessary and sufficient
condition for the convergence of series (10) follows at once from the
axiom of completeness and what we have said regarding mutual
convergence: given any $\varepsilon > 0$, there exists an $N$ such that

$$\|u_{n+1} + u_{n+2} + \ldots + u_{n+p}\| < \varepsilon \quad \text{for } n > N \text{ and } p > 0. \tag{$10_1$}$$


This convergence condition has a particularly simple form when
the terms of series (10) are orthogonal to each other, i.e. $(u_p, u_q) = 0$
for $p \ne q$.

THEOREM. If the terms of series (10) are orthogonal to each other,
the necessary and sufficient condition for its convergence is the
convergence of the following series of non-negative numbers:

$$\sum_{k=1}^{\infty} \|u_k\|^2. \tag{11}$$

For, by Pythagoras’ theorem, condition ($10_1$) can be written in
this case as

$$\|u_{n+1}\|^2 + \|u_{n+2}\|^2 + \ldots + \|u_{n+p}\|^2 < \varepsilon^2 \quad \text{for } n > N \text{ and } p > 0,$$

and this latter condition is necessary and sufficient for the convergence
of series (11).

Whatever the rearrangement of the terms of series (10), the terms 
of series (11) undergo the same rearrangement. But this has no effect 
on its convergence. Consequently, rearrangement of the terms of 
series (10) does not affect its convergence — if it is convergent, it 
remains so after rearrangement; if it is not convergent, it cannot 
become so after rearrangement. By using Pythagoras’ theorem and 
the convergence of series (11), it is easily shown that the sum of the 
series does not depend on the order of the terms. 

We say that a sequence of elements

$$x_1, x_2, x_3, \ldots \tag{12}$$

forms an orthogonal normalized (orthonormal) system if

$$(x_p, x_q) = \begin{cases} 0 & \text{for } p \ne q, \\ 1 & \text{for } p = q. \end{cases} \tag{13}$$

In view of the above theorem, we can assert that the necessary and
sufficient condition for convergence of

$$\sum_{k=1}^{\infty} a_k x_k \tag{14}$$

is that the following series of non-negative numbers be convergent:

$$\sum_{k=1}^{\infty} |a_k|^2. \tag{15}$$

Suppose that this condition is satisfied, and let $x$ denote the sum
of series (14). We form the scalar product

$$\left( \sum_{k=1}^{n} a_k x_k, \; x_p \right).$$

By (13), it is equal to $a_p$ when $n \ge p$, so that we obtain on passing
to the limit as $n \to \infty$:

$$a_p = (x, x_p). \tag{16}$$


The numbers $a_p$ defined by this formula are called the Fourier
coefficients of the element $x$ with respect to system (12), whilst
series (14) is the Fourier series of the element $x$. We obviously have

$$\left\| x - \sum_{k=1}^{n} a_k x_k \right\|^2 = \|x\|^2 - \sum_{k=1}^{n} |a_k|^2, \tag{17}$$

and on indefinite increase of $n$, we get the closure equation

$$\|x\|^2 = \sum_{k=1}^{\infty} |a_k|^2. \tag{18}$$

It follows from the above discussion that, if series (14) is convergent,
it is the Fourier series for its own sum $x$, and the closure equation
(18) holds. Now suppose conversely that some element $x$ of $H$ is
given. Let us form its Fourier coefficients (16) and write formula (17).
This leads us to Bessel’s inequality


$$\sum_{k=1}^{\infty} |a_k|^2 \le \|x\|^2. \tag{19}$$

The series on the left is necessarily convergent, i.e. the Fourier
series of any element $x$ is necessarily convergent. If the = sign holds
in (19), it means, by (17), that the sum of the Fourier series of the
element $x$ is precisely equal to this element $x$. System (12) is said
to be closed if the = sign holds in (19) for any element $x$ of $H$. System
(12) is said to be complete if there exists no element of $H$, apart
from zero, which is orthogonal to all the $x_k$. It can be shown, precisely
as in [58], that closure and completeness are equivalent. If system
(12) is closed, every element $x$ of $H$ is uniquely expressible as a
convergent series (14), i.e. as its Fourier series. Let $a_k$ be the Fourier
coefficients of the element $x$ and $b_k$ those of element $y$. If system (12)
is closed, we obtain a generalized closure equation as in [58]:

$$(x, y) = \sum_{k=1}^{\infty} a_k \bar{b}_k. \tag{$18_1$}$$


Notice also that, if $c_k$ are any complex numbers and $a_k$ are the
Fourier coefficients of an element $x$, the formula holds [cf. 58]:

$$\left\| x - \sum_{k=1}^{n} c_k x_k \right\|^2 = \|x\|^2 - \sum_{k=1}^{n} |a_k|^2 + \sum_{k=1}^{n} |c_k - a_k|^2.$$

A comparison with (17) shows us that the left-hand side of the
last formula takes its least value when the $c_k$ are the Fourier coefficients
of $x$.
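
Formula (17), Bessel’s inequality and the minimal property of the Fourier coefficients can be illustrated in a finite-dimensional model. The sketch below assumes three-dimensional complex space with $(x, y) = \sum_k x_k \bar{y}_k$ and a two-element orthonormal system that is deliberately not complete; the helper names and the sample data are illustrative assumptions, not part of the text.

```python
import math

def inner(x, y):
    # (x, y) = sum_k x_k * conj(y_k)
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def norm_sq(x):
    return inner(x, x).real

def combo(coeffs, system):
    # The finite sum  sum_k c_k x_k
    n = len(system[0])
    return [sum(c * e[i] for c, e in zip(coeffs, system)) for i in range(n)]

def sub(x, y):
    return [s - t for s, t in zip(x, y)]

r = 1 / math.sqrt(2)
system = [[1, 0, 0], [0, r, 1j * r]]   # orthonormal, but not complete in C^3
x = [2 - 1j, 1 + 1j, 3]

a = [inner(x, e) for e in system]       # Fourier coefficients, formula (16)
bessel_sum = sum(abs(t) ** 2 for t in a)

lhs_17 = norm_sq(sub(x, combo(a, system)))
rhs_17 = norm_sq(x) - bessel_sum        # right-hand side of (17)

# Any other choice of coefficients gives a larger error, by exactly sum |c_k - a_k|^2
c = [a[0] + 0.3, a[1] - 0.2j]
err_c = norm_sq(sub(x, combo(c, system)))
excess = sum(abs(ck - ak) ** 2 for ck, ak in zip(c, a))
```

Because the system is not complete here, the strict inequality holds in (19) for this particular $x$.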

Notice that, if $H$ space is taken in the concrete form of function
space $L_2$, convergence in $H$ will correspond to convergence in the
mean in $L_2$, which we discussed in [56]. Convergence of the Fourier
series here leads to the closure equation of [58].

Let us now recall an orthogonalization process that we have already
employed in the case of n-dimensional complex space [III; 29].
Let

$$z_1, z_2, z_3, \ldots \tag{20}$$

be an infinite sequence of non-zero elements of $H$.

We form the normalized element $x_1 = z_1 / \|z_1\|$. Let $z_p$ be the first
of the elements of (20) after $z_1$ that cannot be written in the form
$a_1 x_1$. We form the element $y_2 = z_p - (z_p, x_1) x_1$, which certainly
differs from zero, then we normalize it, i.e. form $x_2 = y_2 / \|y_2\|$. Further,
let $z_q$ be the first of elements (20) after $z_p$ that cannot be written
in the form $a_1 x_1 + a_2 x_2$. We form the element

$$y_3 = z_q - (z_q, x_1) x_1 - (z_q, x_2) x_2,$$

which certainly differs from zero, then normalize it, i.e. form
$x_3 = y_3 / \|y_3\|$. By proceeding in this way, we get the orthogonal
and normalized system (12), which has the following property: every
element $x_k$ is a finite linear combination of elements (20), and vice
versa, an element $z_k$ being expressible in terms of only the first $k$
elements $y_s$. Notice that the mutually orthogonal, non-zero elements
$y_s$ $(s = 1, 2, \ldots, m)$ are linearly independent. For, suppose we had

$$c_1 y_1 + c_2 y_2 + \ldots + c_m y_m = 0.$$

On multiplying both sides by $y_k$ and taking the orthogonality into
account, we get $c_k \|y_k\|^2 = 0$, i.e. $c_k = 0$ $(k = 1, 2, \ldots, m)$, whence
follows the linear independence of the $y_s$.
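
The orthogonalization process can be sketched for a finite list of vectors as follows. This is an illustration under assumptions: the space is three-dimensional complex space, the function name `orthonormalize` is hypothetical, and the numerical tolerance `tol` used to decide that an element "can be written" in terms of the earlier $x_k$ is a practical device that has no counterpart in the exact argument.

```python
import math

def inner(x, y):
    # (x, y) = sum_k x_k * conj(y_k)
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def orthonormalize(zs, tol=1e-12):
    """Orthogonalize the list zs, skipping every element that is a
    linear combination of the x_k already constructed."""
    xs = []
    for z in zs:
        y = list(z)
        for e in xs:
            c = inner(z, e)                      # coefficient (z, x_k)
            y = [yi - c * ei for yi, ei in zip(y, e)]
        n = math.sqrt(inner(y, y).real)
        if n > tol:                              # y differs from zero: keep it
            xs.append([yi / n for yi in y])
    return xs

# Illustrative data: the second vector is a multiple of the first and is skipped
zs = [[1, 1j, 0], [2, 2j, 0], [1, 0, 1], [0, 0, 1]]
xs = orthonormalize(zs)
```

For badly conditioned data a numerically more stable variant (subtracting projections of the running remainder rather than of the original vector) would normally be preferred; the version above follows the formulae of the text literally.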

Since $H$ is separable, there exists a denumerable set $M$ of elements

$$u_1, u_2, u_3, \ldots \tag{21}$$

dense in $H$. If we orthogonalize the sequence $u_k$, we get a complete
(closed) orthonormal system $x_k$ $(k = 1, 2, \ldots)$, consisting of a
denumerable set of elements. The closure of the system follows at
once from the fact that set (21) is everywhere dense in $H$. If only
a finite number of elements remained after orthogonalization, $H$
would be finite-dimensional.

Conversely, if there exists in $H$ a complete orthonormal system
$x_k$ $(k = 1, 2, \ldots)$, consisting of a denumerable set of elements, it is
easily shown that the finite sums $c_1 x_1 + c_2 x_2 + \ldots + c_n x_n$ with
complex rational coefficients $c_s$ ($c_s = a_s + b_s i$, where $a_s$ and $b_s$ are
real rational numbers) form a denumerable set, dense in $H$, i.e. the
separability of $H$ is equivalent to the existence in $H$ of a complete
orthonormal system, representing a denumerable set of elements.

Let us prove a further consequence of the separability of $H$, viz.
every orthonormal system $S(v)$ consists of a finite or denumerable
set of elements.

Let $x$ and $y$ be two mutually orthogonal and normalized elements, i.e.
$(x, y) = 0$ and $\|x\| = \|y\| = 1$. We have $\|x - y\|^2 = (x - y, x - y) = 2$
or $\|x - y\| = \sqrt{2}$, i.e. the distance between two orthogonal
and normalized elements is equal to $\sqrt{2}$. Now suppose that $S(v)$ is a set
of orthonormal elements. We fix $\varepsilon$ so that $0 < \varepsilon < 1/\sqrt{2}$. Given
any $v$ of $S(v)$, there exists an element $u_k$ of set (21), dense in $H$, such
that $\|u_k - v\| < \varepsilon$. On the other hand, if $k$ is fixed, only one element
of $S(v)$ can satisfy $\|u_k - v\| < \varepsilon$, since if there were two distinct
elements $v_1$ and $v_2$ satisfying this inequality, we should have by the
triangle rule: $\|v_1 - v_2\| < 2\varepsilon < \sqrt{2}$, whilst we must have
$\|v_1 - v_2\| = \sqrt{2}$. It follows at once from what has been said that the set $S(v)$
is finite or denumerable.


122. Projections. The concepts of lineal and subspace [95] will play
an essential part in what follows.

A set of elements belonging to any fixed subspace $L$ satisfies all
the axioms enumerated above, excepting possibly axiom B, since
the subspace may be finite-dimensional. Thus every infinite-dimensional
subspace can be regarded as an independent complex Hilbert
space. This remark is quite obvious as regards all the axioms excepting
the axiom of separability. We have to prove the following as regards
this latter: if $H$ is separable, every subspace $L$ is a separable Hilbert
space. The proof of this presents no difficulty [94].

Two subspaces $L$ and $M$ are said to be mutually orthogonal if
any element of $L$ is orthogonal to any element of $M$. We write this
as $L \perp M$. An element $x$ is said to be orthogonal to a subspace $L$
if it is orthogonal to any element of $L$. We write $x \perp L$. Let us now
prove a theorem, vital for what follows.

THEOREM. If $L$ is a subspace, any element $x$ of $H$ can be written as

$$x = y + z, \tag{22}$$

where $y \in L$ and $z \perp L$. Form (22) is unique.

If $x \in L$, we get form (22) by putting $x = x + 0$. Now suppose
that $x$ does not belong to $L$. Let $d$ be the strict lower bound of the
set of positive numbers $\|x - y\|^2$, for $y$ belonging to subspace $L$:

$$d = \inf_{y \in L} \|x - y\|^2. \tag{23}$$

There exists a sequence of elements $y_n$ belonging to $L$ such that

$$(x - y_n, x - y_n) = \|x - y_n\|^2 = d_n \to d. \tag{24}$$

Let $u$ be any element of $L$. Since $L$ is a subspace, given any real
(or even complex) $\varepsilon$, the elements $y_n + \varepsilon u$ belong to $L$, and, in view
of (23), we can write $(x - y_n - \varepsilon u, \; x - y_n - \varepsilon u) \ge d$. On expanding
the scalar product, we get

$$(u, u)\varepsilon^2 - 2\operatorname{Re}(x - y_n, u)\,\varepsilon + (d_n - d) \ge 0.$$

The quadratic form on the left is non-negative for any real $\varepsilon$, so
that we must have
$$|\operatorname{Re}(u, x - y_n)| \le \sqrt{d_n - d} \; \|u\|. \tag{25}$$

We shall strengthen this inequality. Let $\varphi$ be the amplitude of
the complex number $(u, x - y_n)$, i.e. $(u, x - y_n) = |(u, x - y_n)| e^{i\varphi}$.
We replace the element $u$ by $u e^{-i\varphi}$ in (25), where the latter element
also belongs to $L$. On observing that $\|u e^{-i\varphi}\| = \|u\|$ and that

$$(u e^{-i\varphi}, x - y_n) = e^{-i\varphi}(u, x - y_n) = |(u, x - y_n)|,$$

we obtain from (25) the stricter inequality

$$|(u, x - y_n)| \le \sqrt{d_n - d} \; \|u\|. \tag{26}$$


Remember that the $x$ in this inequality is a given element of $H$,
$y_n$ is a sequence of elements of $L$ satisfying condition (24), and $u$
is any element of $L$. Let us now find an upper bound of the scalar
product $(u, y_n - y_m)$. On writing $y_n - y_m = (y_n - x) + (x - y_m)$,
and using (26), we get

$$|(u, y_n - y_m)| \le |(u, x - y_n)| + |(u, x - y_m)| \le \left( \sqrt{d_n - d} + \sqrt{d_m - d} \right) \|u\|.$$

On putting $u = y_n - y_m$ in this inequality and cancelling $\|y_n - y_m\|$
in both sides, we arrive at

$$\|y_n - y_m\| \le \sqrt{d_n - d} + \sqrt{d_m - d}.$$

Notice that this inequality is obvious if $\|y_n - y_m\| = 0$. By (24),
its right-hand side tends to zero on indefinite increase of $m$ and $n$,
so that the sequence of elements $y_n$ is mutually convergent. By the
axiom of completeness, there exists an element $y$ such that $y_n \Rightarrow y$,
and $y \in L$, since $L$ is a subspace. On the other hand, on passing to
the limit in (26), we obtain for any element $u$ of $L$: $(u, x - y) = 0$,
i.e. $x - y$ is orthogonal to $L$. On putting $x - y = z$, we in fact get
(22), in which $y \in L$ and $z \perp L$. It remains to prove that the form
(22) obtained is unique. Let there be two forms:

$$x = y + z = y_1 + z_1,$$

where $y$ and $y_1 \in L$ and $z$ and $z_1 \perp L$. We obviously have
$y - y_1 = z_1 - z$. The left-hand side of this equation is an element of $L$,
and the right-hand side an element orthogonal to $L$. Hence it follows
that $(y - y_1, y - y_1) = 0$, i.e. $\|y - y_1\| = 0$, so that $y_1 = y$, and hence
$z_1 = z$. The theorem is fully proved.

The element $y$ of $L$ in (22) is called the projection of the element
$x$ on the subspace $L$. A set of elements orthogonal to a subspace $L$
obviously forms a subspace. We write this as $M$. By the above
theorem, each element $x$ of $H$ can be expressed uniquely as the sum
of two elements, one of which belongs to $L$, and the other to $M$. The
set of elements orthogonal to $M$ forms the subspace $L$. The relationship
between subspaces $L$ and $M$ is mutual in this respect, and two
such subspaces are described as complementary. In the case of real
three-dimensional space, e.g., the sets of vectors forming the XY
plane and the Z axis are complementary subspaces.
We usually write in the above case:

$$H = L \oplus M \tag{27}$$

or

$$L = H \ominus M; \quad M = H \ominus L, \tag{28}$$

so that $H \ominus M$ is the subspace of elements of $H$ orthogonal to
subspace $M$.
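
The decomposition (22) can be made concrete in a finite-dimensional model. The sketch below rests on assumptions that go beyond the text: $L$ is a two-dimensional subspace of three-dimensional complex space given by a known orthonormal basis, in which case the projection is $y = \sum_k (x, e_k) e_k$; the function name `project` and the sample data are illustrative.

```python
import math

def inner(x, y):
    # (x, y) = sum_k x_k * conj(y_k)
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def project(x, basis):
    """Projection of x on the span of an orthonormal basis:
    y = sum_k (x, e_k) e_k  (finite-dimensional case only)."""
    y = [0j] * len(x)
    for e in basis:
        c = inner(x, e)
        y = [yi + c * ei for yi, ei in zip(y, e)]
    return y

r = 1 / math.sqrt(2)
basis = [[r, r, 0], [0, 0, 1j]]            # orthonormal basis of L
x = [3, 1 - 2j, 2 + 1j]

y = project(x, basis)                       # component in L
z = [xi - yi for xi, yi in zip(x, y)]       # component orthogonal to L

# z is orthogonal to every basis element of L
orth = [abs(inner(z, e)) for e in basis]
# Pythagoras' theorem (9) for x = y + z
pyth = inner(x, x).real - inner(y, y).real - inner(z, z).real

# y realizes the lower bound (23): any other element of L lies further from x
other = [yi + 0.1 * ei for yi, ei in zip(y, basis[0])]
dist_y = inner(z, z).real
diff = [xi - oi for xi, oi in zip(x, other)]
dist_other = inner(diff, diff).real
```

In this example the squared distance from $x$ to the perturbed element exceeds the minimum by $0.1^2 = 0.01$, in agreement with Pythagoras’ theorem.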


123. Linear functionals. We have had the definition of linear
functional $l(x)$ in a $B$ space and hence in $H$. We shall assume that
it is defined throughout $H$. We recall that its norm, which we shall
denote by $n_l$, is defined by

$$n_l = \sup_{\|x\| = 1} |l(x)| \tag{29}$$

and

$$|l(x)| \le n_l \|x\|. \tag{30}$$

Let us give an example of a linear functional. Let $y$ be a fixed
element of $H$. We put

$$l(x) = (x, y). \tag{31}$$

It is distributive by virtue of (1), and bounded by virtue of (5):

$$|l(x)| \le \|y\| \cdot \|x\|. \tag{32}$$

Notice that the = sign holds in this inequality when $x = y$, i.e.
we cannot replace $\|y\|$ by a smaller factor, and $\|y\|$ is the norm
of functional (31). If $y$ is the zero element, then $(x, y) = 0$ for any $x$
(the annihilation functional).

As a matter of fact, (31) gives all the possible functionals in $H$,
i.e. the following important theorem holds:

THEOREM. Every linear functional $l(x)$ is uniquely expressible by (31),
where $y$ is a fixed element of $H$.

Since the functional is distributive, we have $l(\theta) = 0$, where $\theta$
is the zero element of $H$. Let $L$ be the set of all elements $x$ for which
$l(x) = 0$. Since $l(x)$ is distributive and continuous, $L$ is a subspace.
It may happen that $L$ is the complete space $H$, i.e. that $l(x) = 0$
for any element $x$. We can obviously write such a functional in the
form $l(x) = (x, 0)$. We now take the general case, when the subspace
$L$ is part of $H$. Let $z$ be a fixed element of $H$ not belonging to $L$.
We can write it as $z = u + v$, where $u \in L$ and $v \perp L$, and $v \ne 0$
[122]. Since $v$ does not belong to $L$, we have: $l(v) \ne 0$. Let $x$ be any
element of $H$. We form the element $w = x - [l(x)/l(v)]v$ and consider $l(w)$:

$$l(w) = l(x) - \frac{l(x)}{l(v)} \, l(v) = l(x) - l(x) = 0.$$

We see from this that the element $w = x - [l(x)/l(v)]v$ belongs to $L$,
whilst $v \perp L$ as seen above. We can therefore write

$$\left( x - \frac{l(x)}{l(v)} v, \; v \right) = 0,$$

or, on expanding the scalar product:

$$(x, v) - \frac{l(x)}{l(v)} \|v\|^2 = 0,$$

whence it follows that we can write $l(x)$ as the scalar product

$$l(x) = \left( x, \; \frac{\overline{l(v)}}{\|v\|^2} v \right) = (x, y), \quad \text{where} \quad y = \frac{\overline{l(v)}}{\|v\|^2} v.$$

It remains to show that the scalar product form of $l(x)$ is unique.
Let $l(x) = (x, y) = (x, y_1)$. Hence $(x, y - y_1) = 0$ for any $x$ of $H$.
On putting $x = y - y_1$ we get $\|y - y_1\| = 0$, i.e. $y_1 = y$, and the
proof is complete.
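
The construction used in the proof can be followed step by step in a finite-dimensional model. The sketch below is an illustration under assumptions not made in the text: the space is three-dimensional complex space, the functional is $l(x) = \sum_k c_k x_k$ with hypothetical coefficients `c`, and the subspace $L$ of zeros of $l$ is handled through an explicitly constructed spanning set.

```python
import math

def inner(x, y):
    # (x, y) = sum_k x_k * conj(y_k)
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

c = [2 - 1j, 0.5j, -3]          # hypothetical coefficients defining l

def l(x):
    # A distributive functional on C^3
    return sum(ck * xk for ck, xk in zip(c, x))

n = len(c)
unit = [[1 if i == k else 0 for i in range(n)] for k in range(n)]

# z: a fixed element not belonging to L = {x : l(x) = 0}
j = next(k for k in range(n) if l(unit[k]) != 0)
z = unit[j]

# Spanning set of L: e_k - [l(e_k)/l(z)] z for k != j
kernel = []
for k in range(n):
    if k != j:
        t = l(unit[k]) / l(z)
        kernel.append([ek - t * zk for ek, zk in zip(unit[k], z)])

# Orthonormalize the spanning set of L
basis = []
for w in kernel:
    rest = list(w)
    for e in basis:
        coef = inner(w, e)
        rest = [a - coef * b for a, b in zip(rest, e)]
    nrm = math.sqrt(inner(rest, rest).real)
    if nrm > 1e-12:
        basis.append([a / nrm for a in rest])

# z = u + v with u in L and v orthogonal to L
v = list(z)
for e in basis:
    coef = inner(z, e)
    v = [a - coef * b for a, b in zip(v, e)]

# y = conj(l(v)) / ||v||^2 * v, as in the proof
scale = l(v).conjugate() / inner(v, v).real
y = [scale * vk for vk in v]

x = [1 + 1j, -2, 0.5 - 3j]      # arbitrary test element
```

In this model the element $y$ obtained must coincide with the vector of complex conjugates of the coefficients $c_k$, since $\sum_k c_k x_k = (x, \bar{c})$ and the representation is unique.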

The functional defined above is sometimes called a linear functional
of the first kind. A linear functional of the second kind is then defined
as a bounded functional with the property:

$$l_1(c_1 x_1 + c_2 x_2 + \ldots + c_m x_m) = \bar{c}_1 l_1(x_1) + \bar{c}_2 l_1(x_2) + \ldots + \bar{c}_m l_1(x_m),$$

i.e. constant factors are transformed into the conjugate complex
numbers when taken outside the functional sign. An example of a
linear functional of the second kind is provided by the scalar product
in which the variable element $x$ is in the second place, and the fixed
element $y$ in the first:

$$l_1(x) = (y, x). \tag{$31_1$}$$

If $l_1(x)$ is a linear functional of the second kind, $l(x) = \overline{l_1(x)}$ is a
linear functional of the first kind. It follows at once from this remark
and the theorem that ($31_1$) gives the general form of a linear functional
of the second kind.

It also follows from the theorem that every linear functional $l(x)$
is completely defined by the element $y$ of $H$, i.e. the space $H^*$,
conjugate to $H$, is $H$. We recall further that, if $l(x)$ is a distributive
bounded functional on a lineal $L_1$, everywhere dense in $H$, it
can be uniquely extended on to the whole of $H$ in such a way that
it is linear (bounded) on $H$ with the same norm as in $L_1$ [97].


124. Linear operators. We now consider linear (bounded) operators,
defined throughout $H$, the values of which also belong to $H$. In future,
unless there is a special proviso, we shall use the terms “linear
functional” and “linear operator” for distributive bounded functionals
and operators, given throughout $H$ [97, 98]. We shall write $\|A\|$
or $n_A$ for the norm of the operator $A$. We recall the formula


tee eee (33) 


We shall use $E$ to denote the operator of the identity transformation,
i.e. $Ex = x$ for any $x \in H$ ($\|E\| = 1$). If $\|A\| = 0$, $A$ is the
annihilation operator, i.e. $Ax = 0$ for any $x \in H$. Let $L$ be a subspace.
By the theorem of [122], given any $x \in H$, we have the unique form
$x = y + z$, where $y \in L$ and $z \perp L$. The operator transforming $x$
into $y$ is called the projection operator or projector into $L$, and is
denoted by

$$y = P_L x. \tag{34}$$

If $L$ is the whole of $H$, $P_L = E$. If $L$ consists of a single zero element,
$P_L$ is the annihilation operator. In the general case $\|P_L x\| \le \|x\|$,
where the = sign holds when and only when $x \in L$. If $P_L$ is not the
annihilation operator, $\|P_L\| = 1$. $P_L$ is distributive: for, if we have
two expansions: $x_1 = y_1 + z_1$ and $x_2 = y_2 + z_2$, where $y_1$ and $y_2 \in L$,
whilst $z_1$ and $z_2 \perp L$, then $x_1 + x_2 = (y_1 + y_2) + (z_1 + z_2)$, where
$y_1 + y_2 \in L$ and $z_1 + z_2 \perp L$, i.e. $P_L(x_1 + x_2) = P_L x_1 + P_L x_2$.
Similarly, $P_L(ax) = aP_L(x)$.

Let us now introduce some new concepts and discuss the elementary 
properties of linear operators [cf. 97]. We shall often omit the word 
“linear”’. 

If A and B are two operators such that Ax = Bz for any element 2, 
we say that A and B coincide and write A = B. If the distributive 
and bounded operator A is given on some lineal L,, everywhere 
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dense in H, it can be uniquely extended to the whole of H whilst 
remaining distributive and bounded, just as in the case of a functional, 
whilst its norm remains not greater than the original norm in J). 

If A and B are two operators, and a, 6 two complex numbers, 
aA + bB is a linear operator, defined by 


(aA + bB)x =aAx + bBz. (35) 
On taking into account that 
|| ade + bBax|| < ja|-|| Axl] + |b]- || Bel] < (alm, + [| mp) || |], 


we see that the norm of aA + bB is < ]a|n,+ || ng. Operators 
can thus be multiplied by complex numbers and added. This operation 
is subject to the usual laws of algebra. Successive application of 
operators A and B is again a linear operator, which we write symbol- 
ically as BA. Application of the same operators, but in the opposite 
order, leads to a linear operator AB, which in general differs from BA. 
We call BA and AB the products of operators A and B. This definition 
can be extended immediately to the case of any finite number of 
factors. If AB = BA, the operators are said to commute. We have 


|| BAx || ≤ n_B || Ax || ≤ n_B n_A || x ||,

so that the norms of BA and AB are ≤ n_B n_A. Notice also that,
if a is a complex number and A an operator, the norm of aA is
precisely equal to |a| n_A. A product of operators is obviously subject
to the associative law, i.e.

(AB)C = A(BC),

and to the distributive law:

(A + B)C = AC + BC and C(A + B) = CA + CB.
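In the finite-dimensional case operators are matrices, and the rules just stated can be checked directly. The following is only a minimal sketch: the matrices A, B, C and the helper names `matmul` and `add` are illustrative assumptions, not taken from the text.

```python
def matmul(A, B):
    """Matrix of the product AB (apply B first, then A)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def add(A, B):
    """Matrix of the sum A + B."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

# Hypothetical example data, not taken from the text.
A = [[0, 1], [0, 0]]
B = [[0, 0], [1, 0]]
C = [[1, 2], [3, 4]]

AB = matmul(A, B)
BA = matmul(B, A)
commute = AB == BA                      # AB and BA differ for this pair
associative = matmul(matmul(A, B), C) == matmul(A, matmul(B, C))
distributive = (matmul(add(A, B), C) == add(matmul(A, C), matmul(B, C))
                and matmul(C, add(A, B)) == add(matmul(C, A), matmul(C, B)))
```

For this particular pair the two products differ, so A and B do not commute, whilst the associative and distributive laws hold as stated.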


We now introduce conjugate (adjoint) operators. Let A be a linear 
operator; we consider the scalar product (Ax, y). Given any fixed
element y, it is a functional of x. It is distributive, since the scalar
product is distributive, and it is bounded by virtue of the obvious
formula

| (Ax, y) | ≤ n_A || y || · || x ||.

But every functional can be uniquely expressed as a scalar product.
Thus, given any fixed element y, there exists a definite element y*
such that

(Ax, y) = (x, y*) (36)

for any x of H. This formula therefore gives a definite law by which
a definite element y* corresponds to each element y. We write this
law as y* = A* y, where A* is the symbol of some operator. It is
distributive because the scalar products (Ax, y) and (x, y*) are
distributive with respect to the second argument. Further, A* is
bounded, as will be shown below. This operator A* is called the
conjugate to A. We can now write (36) as

(Ax, y) = (x, A* y). (37)


The following formulae for the conjugate to a sum and product 
of operators are immediate consequences of the definition: 


(aA)* = \bar{a} A*; (A + B)* = A* + B*; (AB)* = B* A*; (A*)* = A. (38)

Let us prove, say, the third formula. On twice applying definition
(37), we obtain

(ABx, y) = (Bx, A* y) = (x, B* A* y),

whence it follows that (AB)* = B* A*. Let us also prove the last of
formulae (38). Using definition (37) and the property
(u, v) = \overline{(v, u)}, we have

(A* x, y) = \overline{(y, A* x)} = \overline{(Ay, x)} = (x, Ay),

whence it follows that (A*)* = A. Finally, let us show that A* is
bounded.

THEOREM 1. The norm of the conjugate operator is equal to the norm
of the original operator, i.e. n_{A*} = n_A.

On putting x = A* y in (37) and using inequalities (5) and (33),
we obtain

|| A* y ||^2 = | (A(A* y), y) | ≤ || A(A* y) || · || y || ≤ n_A || A* y || · || y ||,

whence it follows that || A* y || ≤ n_A || y ||, so that n_{A*} ≤ n_A. Since
(A*)* = A, we have, by what has been proved, n_A ≤ n_{A*}, i.e.
n_A = n_{A*}.
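For matrices acting on n-dimensional complex space the conjugate operator is given by the conjugate transpose, and definition (37) together with two of the formulae (38) can be verified numerically. This is a sketch under that assumption; the matrices, the elements x and y, and all helper names are hypothetical.

```python
def inner(u, v):
    """Scalar product (u, v): linear in u, conjugate-linear in v."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def apply(A, x):
    """The element Ax."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]

def adjoint(A):
    """Conjugate transpose of a square matrix."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Hypothetical example data with integer real and imaginary parts.
A = [[1 + 2j, 3j], [-1 + 0j, 2 - 1j]]
B = [[2 + 0j, 1 - 1j], [1j, 4 + 0j]]
x = [1 + 1j, 2 - 3j]
y = [-2j, 1 + 4j]

lhs = inner(apply(A, x), y)                 # (Ax, y)
rhs = inner(x, apply(adjoint(A), y))        # (x, A* y)
product_rule = adjoint(matmul(A, B)) == matmul(adjoint(B), adjoint(A))
involution = adjoint(adjoint(A)) == A
```

The two scalar products agree, and the checks of (AB)* = B* A* and (A*)* = A are exact here because every entry has integer real and imaginary parts.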
An operator A is said to be self-conjugate if A* = A. It is therefore
characteristic of a self-conjugate operator that

(Ax, y) = (x, Ay). (39)

If we put y = x in this equation and note that (Ax, x) = \overline{(x, Ax)},
it will be seen that (Ax, x) is real for any element x in the case of a
self-conjugate operator. The converse is also true.

THEOREM 2. The necessary and sufficient condition for A to be self-
conjugate is that (Ax, x) be real for any element x.

We have proved the necessity. Now suppose that (Ax, x) is real
for any choice of x, and let us show that A is self-conjugate. We have
by hypothesis:

(A(x + y), x + y) = (x + y, A(x + y))
and (A(x + iy), x + iy) = (x + iy, A(x + iy)).

On expanding the scalar products and noticing that (Ax, x) = (x, Ax)
and (Ay, y) = (y, Ay), we obtain

(Ay, x) + (Ax, y) = (y, Ax) + (x, Ay);
(Ay, x) - (Ax, y) = (y, Ax) - (x, Ay).

Term by term subtraction gives us (39), whence it follows that A
is self-conjugate. On recalling (38), we see that any linear combination
a_1 A_1 + a_2 A_2 + ... + a_m A_m of self-conjugate operators A_k with real
coefficients is a self-conjugate operator, whilst the product AB of
self-conjugate operators is self-conjugate when and only when A and
B commute.
Let L be a subspace and M its complement. We have by the
theorem of [122]:

E = P_L + P_M. (40)

It may easily be seen that every projector P_L is a self-conjugate
operator. For, by the orthogonality of L and M, and (40), we have

(P_L x, y) = (P_L x, P_L y + P_M y) = (P_L x, P_L y) =
= (P_L x + P_M x, P_L y) = (x, P_L y).
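The simplest projector is the one onto a one-dimensional subspace L spanned by a single element v, for which P_L x = ((x, v)/(v, v)) v. The sketch below checks the self-conjugacy just proved, together with P_L P_L = P_L and || P_L x || ≤ || x ||, in three-dimensional complex space. The element v, the test elements and the helper names are assumptions made for illustration only.

```python
def inner(u, v):
    """Scalar product (u, v): linear in u, conjugate-linear in v."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def norm(u):
    return abs(inner(u, u)) ** 0.5

def project(x, v):
    """P_L x for the one-dimensional subspace L spanned by v."""
    c = inner(x, v) / inner(v, v)
    return [c * t for t in v]

v = [1 + 1j, 2 + 0j, -1j]        # hypothetical spanning element of L
x = [3 + 0j, 1j, 2 - 2j]
y = [-1 + 2j, 4 + 0j, 1 + 1j]

Px, Py = project(x, v), project(y, v)
self_conjugate_gap = abs(inner(Px, y) - inner(x, Py))    # (P_L x, y) - (x, P_L y)
idempotent_gap = norm([a - b for a, b in zip(project(Px, v), Px)])
contraction = norm(Px) <= norm(x) + 1e-12
```

Both gaps vanish up to rounding error, and the projected element is never longer than the original one.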


Let A be any linear operator. We form the two operators:

A_1 = (1/2)(A + A*); A_2 = (1/(2i))(A - A*). (41)

It may be seen, in view of (38), that A_1 and A_2 are self-conjugate.
We thus have the following expression for any linear operator in
terms of self-conjugate operators: A = A_1 + iA_2.
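The decomposition (41) is the operator analogue of splitting a complex number into its real and imaginary parts. A small numerical sketch for a 2 x 2 matrix follows; the matrix and the helper names `adjoint`, `combine` and `max_gap` are hypothetical, and the conjugate operator is taken to be the conjugate transpose.

```python
def adjoint(A):
    """Conjugate transpose of a square matrix."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

def combine(a, A, b, B):
    """Matrix of aA + bB."""
    return [[a * p + b * q for p, q in zip(ra, rb)] for ra, rb in zip(A, B)]

def max_gap(A, B):
    """Largest absolute difference between corresponding entries."""
    return max(abs(p - q) for ra, rb in zip(A, B) for p, q in zip(ra, rb))

A = [[1 + 2j, 3j], [-1 + 0j, 2 - 1j]]     # hypothetical, not self-conjugate
A_star = adjoint(A)
A1 = combine(0.5, A, 0.5, A_star)          # (A + A*) / 2
A2 = combine(-0.5j, A, 0.5j, A_star)       # (A - A*) / (2i), since 1/(2i) = -i/2
rebuilt = combine(1, A1, 1j, A2)           # A_1 + i A_2
```

Here A_1 and A_2 each coincide with their own conjugate transpose, and A_1 + iA_2 reproduces A.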


125. Bilinear and quadratic functionals. We now show that it is
possible to define any linear operator with the aid of a special type
of functional. A bilinear functional is a definite law by which any
pair of elements x and y of H is associated with a definite complex
number l(x, y), where l(x, y) is distributive with respect to the first
argument as a functional of the first kind and with respect to the
second argument as a functional of the second kind:

l(a x_1 + b x_2, y) = a l(x_1, y) + b l(x_2, y);
l(x, a y_1 + b y_2) = \bar{a} l(x, y_1) + \bar{b} l(x, y_2). (42)


We assume in addition that a bilinear functional is bounded,
i.e. we assume that there exists a positive number N such that,
given any elements x and y of H, we have

| l(x, y) | ≤ N || x || · || y ||. (43)

The least value of N in this inequality, the norm n_l of the bilinear
functional, is defined by

n_l = sup | l(x, y) |   (|| x || = 1, || y || = 1). (44)
If A is any linear operator, the formula

l(x, y) = (Ax, y) (45)

may easily be seen to yield a bilinear functional. Here,

| l(x, y) | ≤ n_A || x || · || y ||,

so that n_l ≤ n_A for the bilinear functional (45).

We now show that (45) gives all possible bilinear functionals. 

THEOREM. Every bilinear functional is uniquely expressible by (45),
where A is a linear operator, and the norm of the bilinear functional n_l
is equal to the norm of the operator n_A.

If we fix x, l(x, y) is a functional of the second kind in y, and we
can write [123]: l(x, y) = (z, y), where z is uniquely defined if we
fix x, i.e. z = Ax, where A is an operator defined throughout H.
Its distributiveness is a direct consequence of (42) and the
distributiveness of (z, y) with respect to z. Let us show that A is bounded.
Using (43) with N = n_l, we can write

| (Ax, y) | ≤ n_l || x || · || y ||.

On putting y = Ax and dividing both sides of the inequality
obtained by || Ax ||, we get || Ax || ≤ n_l || x || (if || Ax || = 0, the
last inequality is obvious). Hence it follows that A is bounded and
that n_A ≤ n_l. But we know from above that n_l ≤ n_A, so that n_l = n_A.

It remains to prove that form (45) is unique. Let

l(x, y) = (Ax, y) = (A_1 x, y).

It follows from this that (Ax - A_1 x, y) = 0 for any x and y.
On putting y = Ax - A_1 x in this equation, we get || Ax - A_1 x || = 0,
i.e. we have A_1 x = Ax for any x, i.e. operators A and A_1 coincide,
which completes the proof. It follows from this theorem that specifying
a linear operator is equivalent to specifying a bilinear functional.
Similarly, in algebra, specifying the elements a_{ik} of a matrix is
equivalent to specifying the bilinear form

Σ_{i,k=1}^{n} a_{ik} x_k \bar{y}_i.

Every bilinear functional l(x, y) generates a corresponding quadratic
functional (quadratic form), if we put y = x in it:

l(x, x) = (Ax, x).


A bilinear functional is readily expressed in terms of the quadratic 
form generated by it, i.e. it can easily be shown that 


(Ax, y) = [(Ax_1, x_1) - (Ax_2, x_2)] + i[(Ax_3, x_3) - (Ax_4, x_4)], (46)

where

x_1 = (1/2)(x + y); x_2 = (1/2)(x - y);
x_3 = (1/2)(x + iy); x_4 = (1/2)(x - iy). (47)

Four quadratic functionals appear on the right-hand side of (46).
The fact that the quadratic functional (Ax, x) is real for any element x
is a characteristic feature of a self-conjugate operator, as we have
seen.
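Formula (46) can be checked numerically for a matrix operator. In the sketch below the bilinear functional (Ax, y) is rebuilt from the four quadratic functionals of (46) and (47) and compared with its direct value. The matrix, the elements x and y, and all function names are illustrative assumptions.

```python
def inner(u, v):
    """Scalar product (u, v): linear in u, conjugate-linear in v."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def apply(A, x):
    """The element Ax."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]

def q(A, x):
    """Quadratic functional (Ax, x)."""
    return inner(apply(A, x), x)

def half_sum(x, c, y):
    """The element (x + c y) / 2."""
    return [(a + c * b) / 2 for a, b in zip(x, y)]

def bilinear_from_quadratic(A, x, y):
    """Right-hand side of (46), built from x_1, ..., x_4 of (47)."""
    x1, x2 = half_sum(x, 1, y), half_sum(x, -1, y)
    x3, x4 = half_sum(x, 1j, y), half_sum(x, -1j, y)
    return (q(A, x1) - q(A, x2)) + 1j * (q(A, x3) - q(A, x4))

# Hypothetical example data; A need not be self-conjugate.
A = [[1 + 2j, 3j], [-1 + 0j, 2 - 1j]]
x = [1 + 1j, 2 - 3j]
y = [-2j, 1 + 4j]

direct = inner(apply(A, x), y)
rebuilt = bilinear_from_quadratic(A, x, y)
```

The values of `direct` and `rebuilt` agree up to rounding error, although A is not self-conjugate.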

Suppose that A has the property that (Ax, x) = 0 for any element
x. It now follows from (46) that (Ax, y) = 0 for any x and y. But the
bilinear functional (Ax, y) obviously has this property if A is the
annihilation operator, and we can say in view of the uniqueness
indicated in the theorem that, if (Ax, x) = 0 for any x, A is the
annihilation operator. It follows at once from this that, if A and B are
such that (Ax, x) = (Bx, x) for any x, then A = B.


126. Bounds of a self-conjugate operator. Let A be a self-conjugate
operator. Using (5) and (33), we can write | (Ax, x) | ≤ n_A || x ||^2,
or, if we take || x || = 1, we get | (Ax, x) | ≤ n_A. Hence, if we take
all possible normalized elements x, i.e. such that || x || = 1, the set of
real numbers (Ax, x) is bounded from below and above. Let m denote
the strict lower bound and M the strict upper bound of this set:

m = inf (Ax, x); M = sup (Ax, x)   (|| x || = 1). (48)

The numbers m and M are usually termed the bounds of the self-
conjugate operator. We can write, by definition of the strict bounds:

m ≤ (Ax, x) ≤ M for || x || = 1. (49)

Constant factors can be taken outside the scalar product, so that
we can write for an element x with any norm:

m || x ||^2 ≤ (Ax, x) ≤ M || x ||^2. (50)


As a matter of fact, the norm n_A of the operator is very simply
expressed in terms of its bounds m and M, in accordance with the
following theorem:

THEOREM 1. The norm n_A is equal to the greater of the two numbers
| m | and | M |.

The proof is a word-for-word repetition of the proof of Theorem 3
of [IV; 36, 39], in which it is shown that

n_A = sup | (Ax, x) |   (|| x || = 1),

which is the same as the statement of the theorem. Notice also that
Theorem 2 of [IV; 36] is the same as the assertion that n_l = n_A,
which was proved in the previous section.
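Theorem 1 can be illustrated for a real symmetric 2 x 2 matrix by sampling normalized elements on the unit circle. The sketch rests on several assumptions of ours: the matrix is hypothetical, only real unit vectors are sampled (for a real symmetric matrix they give the same bounds as complex ones), and the bounds are merely estimated on a finite grid.

```python
import math

A = [[2.0, 1.0], [1.0, -3.0]]    # hypothetical real symmetric matrix

def apply(A, x):
    """The element Ax."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]

def dot(u, v):
    """Scalar product of real elements."""
    return sum(a * b for a, b in zip(u, v))

# Normalized elements (cos t, sin t); half a turn suffices,
# because x and -x give the same values of (Ax, x) and || Ax ||.
N = 20000
units = [(math.cos(math.pi * k / N), math.sin(math.pi * k / N)) for k in range(N)]

forms = [dot(apply(A, x), x) for x in units]
m_est, M_est = min(forms), max(forms)          # estimates of the bounds (48)
norm_est = max(math.sqrt(dot(apply(A, x), apply(A, x))) for x in units)
```

For this matrix the eigenvalues are (-1 ± sqrt(29))/2, so that m is roughly -3.19 and M roughly 2.19. The estimated norm agrees with the greater of | m | and | M | to within the accuracy of the grid.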

Let us introduce some new concepts. 

DEFINITION. A self-conjugate operator A is said to be positive
(negative) if the corresponding quadratic functional (Ax, x) ≥ 0 (≤ 0). It
is characteristic of a positive operator that m ≥ 0, i.e. that its lower
bound is non-negative. We can also talk of a self-conjugate operator A
being greater than a self-conjugate operator B, and write A > B,
if A and B do not coincide and the difference A - B is a positive
operator. A negative operator is similarly defined. It may be recalled
that, in the case of n-dimensional space, a self-conjugate matrix is
called positive if the corresponding Hermitian form

Σ_{i,k=1}^{n} a_{ik} \bar{x}_i x_k   (a_{ki} = \bar{a}_{ik})

takes only non-negative values. To say that a matrix is positive is
equivalent to saying that it has no negative eigenvalues. If (Ax, x)
changes sign for different x, the self-conjugate operator A can
evidently not be called either positive or negative. In the case of
finite-dimensional space, these A are the self-conjugate matrices
whose eigenvalues have different signs.

THEOREM 2. If A is a self-conjugate operator, A^2 is a positive operator.
If A is any linear operator, the operators AA* and A*A are self-conjugate
and positive.

The first assertion follows at once from

(A^2 x, x) = (Ax, Ax) = || Ax ||^2 ≥ 0,

and the second from

(AA* x, x) = (A* x, A* x) = || A* x ||^2 ≥ 0, (51)
(A*A x, x) = (Ax, Ax) = || Ax ||^2 ≥ 0. (52)


127. The inverse operator. An important concept in the theory of
operators is that of the inverse operator (cf. the concept of inverse
matrix in III_1). Various definitions can be given of the inverse operator,
and will be described in the present section.

As in the previous section, we shall here describe a distributive, 
bounded operator, given throughout H, as linear. 

DEFINITION. A linear operator A is said to have a bounded inverse B
if B is a bounded operator defined throughout H, and

AB = BA = E, (53)

where E is the operator of the identity transformation. The fact
that B is bounded is described by the usual inequality
|| Bx || ≤ N || x ||. It may easily be seen that there can only be one bounded
inverse operator. For, if we have AC = E, on multiplying by B on
the left, using (53) and recalling that BE = B and EC = C, we get
B = C. The operator B defined above is usually written as A^{-1},
and we have

A A^{-1} = A^{-1} A = E. (54)

Suppose we have

y = Ax (x ∈ H). (55)

Since A^{-1} is defined throughout H, we can apply A^{-1} to both
sides and obtain

x = A^{-1} y. (56)

It is evident from this that, if A has a bounded inverse A^{-1}, A
performs a one-to-one mapping of space H into itself, i.e. a definite
element y corresponds, in accordance with (55), to any element
x ∈ H, and conversely, a definite element x, given by (56), corresponds
to any element y ∈ H. Similarly, A^{-1} maps H one-to-one into
itself. Since A is distributive, A^{-1} must be distributive, i.e. A^{-1} is a
linear operator. It follows at once from (54) that

(A^{-1})^{-1} = A. (57)


A more general definition of inverse operator can be given. We
notice first of all that, since a linear operator is distributive, the set
of elements y given by (55) is a lineal, which we denote by R(A).
Let us now consider the property that A must have in order for the
correspondence between elements x of H and elements y of R(A) to
be one-to-one. By (55), a definite element y of R(A) corresponds to
any x of H. We have to show that, conversely, a definite element
x of H corresponds to any element y of R(A). Let x_1 and x_2 be two
distinct elements of H, and y_1, y_2 the corresponding elements of R(A):

y_1 = Ax_1, y_2 = Ax_2.

Subtraction gives

y_2 - y_1 = A(x_2 - x_1).

If we had y_1 = y_2, i.e. the same element of R(A) corresponded to
distinct elements x_1 and x_2 of H, we should have A(x_2 - x_1) = 0,
i.e. the equation

Ax = 0 (58)

must have a solution different from the zero element. Conversely,
if (58) has a non-zero solution x_0, the same element y = 0 corresponds
to distinct elements x = x_0 and x = 0. Thus the necessary and
sufficient condition for (55) to give a one-to-one correspondence
between elements x of H and elements y of R(A) is that (58) have
only a zero solution. An inverse B to operator A is now defined on
the lineal R(A). It transforms an element y of R(A) into an element
x of H such that y is expressed in terms of x by (55). We shall call
this operator simply the inverse, as distinct from the bounded inverse
which we defined above. The operator B is defined only on the lineal
R(A), which may in fact not coincide with H, and we can by no
means assert that B is bounded. But, since A is distributive, we
can say that B is a distributive operator on the lineal R(A). On using
the previous notation A^{-1} for B, we can write A^{-1}(Ax) = x if x ∈ H
and A(A^{-1}x) = x if x ∈ R(A).

We shall prove later that, if equation (58) has only a zero solution
and R(A) is the whole of H, the operator B = A^{-1} is bounded, i.e.
A has a bounded inverse [cf. 97].

Let operator A have a bounded inverse, and let us pass to the 
conjugate operators in equations (53): 


(A^{-1})* A* = A* (A^{-1})* = E* = E.

Hence it follows that

(A^{-1})* = (A*)^{-1}. (59)


Formulae (53) require that the bounded operator B be inverse 
to A from the left and the right, and we have simply called it the 
bounded inverse operator in this case. 

Let us now consider bounded inverse operators only from the left 
or only from the right. 

We say that A has a bounded inverse from the left, or simply
a left-hand inverse, if there exists a linear operator B such that
BA = E. Similarly, if AC = E, C is called a bounded right-hand
inverse.

THEOREM 1. If A has a left-hand inverse B and a right-hand inverse C,
there can be only one left-hand and only one right-hand inverse, these
being coincident, i.e. there exists the bounded inverse A^{-1}.

By hypothesis, BA = E and AC = E, whence it follows that
(BA)C = C and B(AC) = B. The left-hand sides of these equations
coincide, so that B = C, i.e. every left-hand inverse coincides with
every right-hand inverse, so that there can only be one left-hand
inverse and only one right-hand inverse.

THEOREM 2. If a unique left-hand inverse exists, a right-hand inverse 
also exists. If a unique right-hand inverse exists, a left-hand inverse 
also exists. In both cases both inverses are unique and coincide (by 
Theorem 1). 

Let us prove the first statement. Let A have a unique left-hand
inverse B, i.e. BA = E. On multiplying from the left by A, we
get ABA = A or (AB - E)A = 0, where the zero on the right-
hand side denotes the annihilation operator. On adding to both sides
BA = E, we can write (AB - E + B)A = E. But B is the unique
left-hand inverse by hypothesis, so that AB - E + B = B, whence
AB = E, i.e. B is also a right-hand inverse.

A further remark: if A has two different left-hand (right-hand)
inverse operators B and C, it has an infinite set of left-hand
(right-hand) inverse operators. For, if BA = E and CA = E, it is easily
seen that the operator B + a(C - B) is a left-hand inverse for any choice
of the number a:

(B + aC - aB)A = BA + aCA - aBA = E + aE - aE = E.


It follows from the above results that the following four cases 
are conceivable: 

(I) there exists a unique left- and right-hand inverse; 

(II) no inverses exist, either left- or right-hand; 

(III) there exists an infinite set of left-hand inverses and none 
right-hand; 

(IV) there exists an infinite set of right-hand inverses and none 
left-hand. 

We shall see later that all these cases may be realized. A simple
theoretical criterion will be given now, with the aid of which these
cases can be distinguished. We consider the self-conjugate positive
(non-negative) operators A*A and AA*. The lower bounds of these
operators, denoted by m(A*A) and m(AA*), are greater than
or equal to zero [126]. Suppose that there exists at least one left-
hand inverse: BA = E, and let k > 0 be the norm of B. We have
|| BAx || = || x ||, and on the other hand, || BAx || ≤ k || Ax ||,
whence it follows that k || Ax || ≥ || x || and || Ax || ≥ (1/k) || x ||.
Now (A*Ax, x) ≥ (1/k^2) || x ||^2, so that m(A*A) ≥ 1/k^2, i.e.
m(A*A) > 0. We now show that, conversely, if m(A*A) > 0, A has a bounded
left-hand inverse. We shall prove below that, if the lower bound
of a self-conjugate operator F is positive, F must have a bounded
inverse [129]. On applying this to F = A*A, we see that there
exists a bounded operator D such that DA*A = E, i.e. (DA*)A = E,
whence it follows that DA* is a bounded left-hand inverse for A.
Similarly, the necessary and sufficient condition for the existence of
at least one bounded right-hand inverse is that m(AA*) > 0.

It follows at once from these arguments that the necessary and 
sufficient conditions for the realization of the above four cases are: 


I. m(A*A) > 0 and m(AA*) > 0;
II. m(A*A) = 0 and m(AA*) = 0;
III. m(A*A) > 0 and m(AA*) = 0;
IV. m(A*A) = 0 and m(AA*) > 0.

Notice that, if A commutes with A* (say A = A*, i.e. A is self-
conjugate), cases III and IV cannot hold. On using (51) and (52),
we can state the result in the first case as follows: the necessary
and sufficient condition for the existence of a left- and right-
hand inverse operator is that there exist a positive number l such
that || Ax || ≥ l || x || and || A* x || ≥ l || x || for any element x.
We have not made use of the distributiveness of B and C in any of
the above. The only important point is that they be defined and
bounded in the whole of H. In case I, the unique left- and right-
hand inverse is necessarily a linear operator, as we have already
seen. In case III there exists a linear operator B = DA*, inverse
from the left, and similarly in case IV. We shall subsequently be
concerned with an inverse distributive operator A^{-1}, defined on
R(A). Notice that, if BA = E, then A* B* = E, i.e. if case III applies
for A, case IV applies for A*.

Inverse operators play a fundamental role when solving the equation
Ax = y, where y is the given and x the required element. If there
exists a left-hand inverse B, multiplication of both sides by B
gives us the equation x = By, i.e. when the left-hand inverse exists,
the solution, if there is one, is expressible as x = By, and is therefore
unique. If a right-hand inverse C exists, the equation Ax = y is
obviously satisfied by substituting x = Cy, i.e. the existence of the
right-hand inverse guarantees the existence of the solution x = Cy.
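A standard illustration of case III, which is not taken from the text, is the shift operator on sequences: A sends (x_1, x_2, ...) to (0, x_1, x_2, ...). The sketch below represents sequences with finitely many non-zero terms as Python lists; the function names and the parameter a are assumptions made for the example.

```python
def A(x):
    """Right shift: (x1, x2, ...) -> (0, x1, x2, ...)."""
    return [0] + list(x)

def B(y):
    """Left shift: (y1, y2, y3, ...) -> (y2, y3, ...); a left-hand inverse of A."""
    return list(y[1:])

def C(y, a=5):
    """Another left-hand inverse: (y1, y2, y3, ...) -> (y2 + a*y1, y3, ...)."""
    shifted = list(y[1:])
    if shifted:
        shifted[0] += a * y[0]
    return shifted

x = [1, 2, 3]
y_good = [0, 4, 5]    # first term zero: belongs to R(A)
y_bad = [7, 4, 5]     # first term non-zero: does not belong to R(A)

left_ok = B(A(x)) == x and C(A(x)) == x          # BA = E and CA = E on this x
right_fails = A(B(x)) != x                       # AB is not E
solvable = A(B(y_good)) == y_good                # x = By solves Ax = y_good
unsolvable = A(B(y_bad)) != y_bad                # Ax = y_bad has no solution
```

Each value of a gives a different left-hand inverse, in agreement with the remark that two distinct left-hand inverses imply an infinite set of them. The equation Ax = y has at most one solution, x = By, and has one only when the first term of y is zero, so that no right-hand inverse can exist.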


128. Spectrum of an operator. There are two fundamental problems
in the application of the theory of operators to mathematical analysis:
solution of the homogeneous equation

Ax = λx (60), i.e. (A - λE)x = 0, (60_1)

and of the non-homogeneous equation

Ax = λx + y (61), i.e. (A - λE)x = y, (61_1)

where x is the required and y the given element, and λ is a numerical
parameter. We call λ an eigenvalue of the operator A if (60) has
solutions differing from the zero element; these solutions are called
the eigenelements of operator A, corresponding to the eigenvalue.
If λ is an eigenvalue of A, and we associate the zero element with
the corresponding eigenelements (the zero element satisfies (60) for
any λ), we can say, since equation (60) is linear and homogeneous,
and operator A is continuous, that our set of eigenelements (including
the zero) forms a subspace. We shall call it the subspace of eigen-
elements corresponding to the eigenvalue.

If this subspace has a finite number of dimensions r, i.e. if the
maximum number of linearly independent elements belonging to the
subspace is equal to the finite number r, we say that the corresponding
eigenvalue λ has rank or multiplicity r. If the subspace of eigen-
elements is infinite-dimensional, we say that the rank of the eigen-
value is equal to infinity. The following theorem holds in the case
of a self-conjugate operator A:

THEOREM 1. The eigenvalues of a self-conjugate operator are real 
and eigenelements corresponding to different eigenvalues are mutually 
orthogonal. 

Let λ be an eigenvalue of the self-conjugate operator A and x a
corresponding eigenelement (non-zero). On multiplying both sides of
(60_1) by x from the right, we get

(Ax, x) = λ || x ||^2.

Since A is self-conjugate, the left-hand side is real, so that λ is
real. Let λ′ and λ″ be two distinct eigenvalues, and x′, x″ corresponding
eigenelements:

Ax′ = λ′x′; Ax″ = λ″x″.

On forming the right-hand scalar product of the first equation
with x″, and the left-hand of the second with x′ and subtracting term
by term, we obtain

(Ax′, x″) - (x′, Ax″) = (λ′ - λ″)(x′, x″).

The left-hand side vanishes, since A is self-conjugate, and
λ′ - λ″ ≠ 0. Hence (x′, x″) = 0, and the theorem is proved.
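Both assertions of the theorem can be seen on a small self-conjugate matrix whose eigenvalues and eigenelements are known in closed form. The matrix and the eigenpairs below are our own example, and the helper names are hypothetical.

```python
def inner(u, v):
    """Scalar product (u, v): linear in u, conjugate-linear in v."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def apply(A, x):
    """The element Ax."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]

A = [[2 + 0j, 1j], [-1j, 2 + 0j]]        # hypothetical self-conjugate matrix
pairs = [(3, [1j, 1 + 0j]),               # (eigenvalue, eigenelement)
         (1, [-1j, 1 + 0j])]

# Residuals of Ax = lambda x for each pair.
residuals = [max(abs(a - lam * t) for a, t in zip(apply(A, v), v))
             for lam, v in pairs]
# (Ax, x) / || x ||^2 recovers the eigenvalue and must be real.
quotients = [inner(apply(A, v), v) / inner(v, v) for _, v in pairs]
# Eigenelements for distinct eigenvalues are orthogonal.
overlap = inner(pairs[0][1], pairs[1][1])
```

The quotients (Ax, x)/|| x ||^2 come out as the real numbers 3 and 1, and the scalar product of the two eigenelements vanishes.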

Solving the non-homogeneous equation (61_1) amounts to finding
the operator (A - λE)^{-1}, inverse to A - λE. If λ is an eigenvalue
of A, the homogeneous equation (60_1) has solutions differing from
the zero element, and by what was said in [127], the inverse
(A - λE)^{-1} certainly does not exist. If λ is not an eigenvalue of A,
the inverse (A - λE)^{-1} does exist, but it may be either a bounded
inverse or simply an inverse. Notice that the parameter λ may be
any complex number. We introduce the following definition.

DEFINITION. The value or point λ (in the plane of the complex variable)
is called a regular point of the operator A if A - λE has a bounded
inverse

R_λ = (A - λE)^{-1}, (62)

and this linear operator R_λ, defined for all regular points λ, is called
the resolvent of A. The spectrum of an operator A is the set of the points
λ which are not regular points of A.

By what has been said, every eigenvalue of an operator A belongs
to its spectrum. We shall see later that λ that are not eigenvalues
may also belong to the spectrum.

Given any element y, if λ is a regular point, the non-homogeneous
equation (61_1) has a unique solution defined by

x = (A - λE)^{-1} y. (63)

If λ is not a regular point and is not an eigenvalue of A, (61_1) also
has a unique solution, if y belongs to the lineal R(A - λE). This
lineal consists of the elements y given by

y = (A - λE)x (x ∈ H), (64)

when x runs over the whole of H.

Thus the inverse operator (A - λE)^{-1} is defined on the lineal
R(A - λE), if λ is not an eigenvalue. If λ is also not a regular point
of A, the operator (A - λE)^{-1} is called the resolvent of A.

THEOREM 2. The elements of R(A - λE) are orthogonal to all the
solutions of the equation (A* - \bar{λ}E)z = 0.

This assertion follows at once from the obvious equation

((A - λE)x, z) = (x, (A* - \bar{λ}E)z).

Notice that, if A is a self-conjugate operator and λ an eigenvalue
(which is real), it follows from Theorem 2 that the elements of R(A - λE)
are orthogonal to the eigenelements of A corresponding to the eigen-
value λ. We shall prove in the next section several theorems
characterizing the spectrum of a self-conjugate operator.

A further point must be mentioned in regard to the eigenelements
of a self-conjugate operator A. As we have seen, the eigenelements
corresponding to an eigenvalue λ = λ′ form a subspace. A complete
orthonormal system can be introduced into this subspace. If the
eigenvalue λ = λ′ has rank r, this orthonormal system will contain
r elements. We know that the eigenelements corresponding to different
eigenvalues are mutually orthogonal. Hence, if we introduce an
orthonormal system as indicated above into each of the subspaces
corresponding to a fixed eigenvalue, we get an orthonormal system
K in H. We shall say that the self-conjugate operator generates the
orthonormal system K in H. This system is defined up to the choice
of the complete orthonormal system in each of the subspaces
in question. It may happen that A has no eigenvalues at all. In this
case there will be no system K. We know that an orthonormal system
may contain only a finite or denumerable set of elements. It follows
at once from this that, if A has an infinite set of distinct eigenvalues,
this set is denumerable.

The orthonormal system K may be complete or incomplete in H.
Its property of being complete or incomplete is easily seen to be
independent of the choice of complete orthonormal system in the
subspaces of eigenelements corresponding to the fixed eigenvalue.
If K is a complete system, operator A is said to have a purely point
spectrum.


129. The spectrum of a self-conjugate operator. We shall discuss
self-conjugate operators in this section.

THEOREM 1. If λ is not an eigenvalue of the self-conjugate operator A,
(64) defines a lineal R(A - λE), everywhere dense in H.

Let us suppose the contrary, i.e. that R(A - λE) is not dense
in H, i.e. that the closure of R(A - λE) leads to a subspace different
from H. Now, by the theorem of [122], there exists a non-zero element
x_0, orthogonal to this subspace and hence to R(A - λE), i.e.
((A - λE)x, x_0) = 0 for any x of H, or, since A is self-conjugate:
(x, (A - \bar{λ}E)x_0) = 0. On putting x = (A - \bar{λ}E)x_0, we get
|| (A - \bar{λ}E)x_0 || = 0, i.e. Ax_0 = \bar{λ}x_0. If λ is real, i.e.
\bar{λ} = λ, λ must be an eigenvalue of A, which contradicts the
hypothesis. If λ is not real, the equation Ax_0 = \bar{λ}x_0 shows that
the non-real number \bar{λ} is an eigenvalue of the self-conjugate
operator A, which is impossible, so that the theorem is proved.

If λ is a regular point, R(A - λE) coincides with H. This follows
from the definition of regular point. If λ is an eigenvalue, all the
elements of R(A - λE) are orthogonal to the corresponding eigen-
elements of A, and the lineal R(A - λE) cannot be dense in H.
We shall see later that, if λ is not a regular point and is not an eigen-
value, the lineal R(A - λE) is not H (it is dense in H).

We now establish the necessary and sufficient condition for λ to be
regular.

THEOREM 2. The necessary and sufficient condition for λ to be a
regular point of a self-conjugate operator A is that there exist a positive
number p such that, for any x of H,

|| (A - λE)x || ≥ p || x ||. (65)

Every non-real λ, and every real λ lying outside the interval [m, M],
where m and M are the bounds of A, is regular.

Let us prove that (65) is necessary. Let λ be a regular point. There
now exists a bounded inverse operator (62). Writing q for its norm,
we have

|| (A - λE)^{-1} y || ≤ q || y ||,

or, on putting y = (A - λE)x in this inequality, we arrive at (65)
with p = 1/q. As regards the sufficiency of (65), it follows first of
all from this condition that λ is not an eigenvalue, so that the lineal
R(A - λE), defined in Theorem 1, is dense in H. We shall show
that it is closed, and therefore coincides with H. Suppose that the
elements y_n = (A - λE)x_n belong to R(A - λE) and y_n → y.
We have to show that y ∈ R(A - λE). By (65), we have
|| y_n - y_m || ≥ p || x_n - x_m ||. The sequence y_n is mutually
convergent, and we can say, in view of the last inequality, that the
sequence x_n is also mutually convergent, i.e. there exists an element x
such that x_n → x. It follows from y_n = (A - λE)x_n that
y_n → y = (A - λE)x, so that y ∈ R(A - λE). The lineal R(A - λE)
therefore coincides with H and the inverse (A - λE)^{-1} of (A - λE)
is defined throughout H. To show that λ is a regular point, it remains
to prove that (A - λE)^{-1} is bounded. On putting x = (A - λE)^{-1} y
in (65), we get

|| (A - λE)^{-1} y || ≤ (1/p) || y ||,

whence it follows that (A - λE)^{-1} is bounded, and the first part
of the theorem is proved. Let λ = σ + iτ, where τ ≠ 0. Putting
(A - λE)x = y, we can write

((A - λE)x, x) = (y, x) and ((A - \bar{λ}E)x, x) = (x, (A - λE)x) = (x, y).

On subtracting the second from the first equation, we get

(\bar{λ} - λ)(x, x) = (y, x) - (x, y),

or

2 |τ| || x ||^2 ≤ | (y, x) | + | (x, y) |,

and, by using inequality (5), we arrive at

2 |τ| || x || ≤ 2 || y ||,

i.e.

|| (A - λE)x || ≥ |τ| || x ||   (λ = σ + iτ).
We have arrived at (65) with p = |τ|, and every non-real value
of λ is therefore regular. Now let λ be real, but lie outside the interval
[m, M]. Suppose, say, λ > M; we show that (65) is now satisfied.
We can write

((A - λE)x, x) = (Ax, x) - λ || x ||^2,

or

((A - λE)x, x) = [(Ax, x) - M || x ||^2] - (λ - M) || x ||^2.

It follows from the definition of the upper bound M of operator
A that the difference in square brackets is non-positive. In addition,
λ > M by hypothesis, and the last formula gives

| ((A - λE)x, x) | ≥ (λ - M) || x ||^2.

On the other hand, we have

| ((A - λE)x, x) | ≤ || (A - λE)x || · || x ||.

The last two inequalities together yield

|| (A - λE)x || ≥ (λ - M) || x ||,

whence (65) follows when λ > M, which is what we set out to prove.
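The two estimates obtained in the proof, with p = |τ| for non-real λ and p = λ - M for real λ > M, can be tested on a small self-conjugate matrix. In this sketch the matrix (with bounds m = 1 and M = 3), the two values of λ, the random test elements and the helper names are all assumptions of ours.

```python
import random

def inner(u, v):
    """Scalar product (u, v): linear in u, conjugate-linear in v."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def norm(u):
    return abs(inner(u, u)) ** 0.5

def apply(A, x):
    """The element Ax."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]

def shifted(A, lam, x):
    """The element (A - lam E) x."""
    return [a - lam * t for a, t in zip(apply(A, x), x)]

A = [[2 + 0j, 1j], [-1j, 2 + 0j]]    # hypothetical; its bounds are m = 1, M = 3
rng = random.Random(0)                # fixed seed, so the run is reproducible
samples = [[complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(2)]
           for _ in range(200)]

lam_complex = 2 + 0.5j                # tau = 0.5
lam_real = 4.0                        # lam_real - M = 1
ok_complex = all(norm(shifted(A, lam_complex, x)) >= 0.5 * norm(x) - 1e-12
                 for x in samples)
ok_real = all(norm(shifted(A, lam_real, x)) >= 1.0 * norm(x) - 1e-12
              for x in samples)

# For the eigenvalue 3 no positive p can exist: an eigenelement is annihilated.
x0 = [1j, 1 + 0j]
eigen_residual = norm(shifted(A, 3, x0))
```

Both inequalities hold for every sampled element, whereas for λ = 3 the eigenelement x0 gives || (A - λE)x0 || = 0, so that (65) fails for every positive p.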
This theorem has several corollaries. 
COROLLARY 1. The necessary and sufficient condition for λ to belong
to the spectrum is that there exist a sequence of normalized elements
x_n such that

|| (A - λE)x_n || → 0.

For, if there is such a sequence, condition (65) cannot be fulfilled
with p > 0, so that λ must belong to the spectrum. Conversely,
if λ belongs to the spectrum, (65) is not fulfilled for any p > 0, i.e.
there exists a sequence of normalized elements x_n such that
|| (A - λE)x_n || → 0. Notice that, if λ is an eigenvalue, we can take the
same element as x_n, whatever the n, viz. any normalized eigenelement
x_0. Now (A - λE)x_n = 0 for any n.

COROLLARY 2. If the lower bound m(A) > 0, λ = 0 lies outside
[m, M], and A has a bounded inverse. We made use of this in [127].

COROLLARY 3. The set of regular points of the real λ axis is open.

Let λ be a regular point. We have to show that, given any
sufficiently small ε > 0, all the points λ ± ε are also regular. By
hypothesis, there exists a positive p such that (65) holds, whence

|| (A - (λ ± ε)E)x || ≥ || (A - λE)x || - ε || x || ≥ (p - ε) || x ||,

from which it follows that, when ε < p, every point λ ± ε is regular.

COROLLARY 4. The points of the spectrum of a self-conjugate operator
form a closed set. This follows at once from Corollary 3 [32].

THEOREM 3. The values λ = m and λ = M belong to the spectrum.

Let us prove this for λ = M, on the assumption that M > m.
We introduce the self-conjugate operator B = A - mE, having the
bounds 0 and M_1 = M - m > 0. Its norm is equal to M_1 [126].
It follows from the definition of upper bound that there exists a
sequence x_n of normalized elements such that (Bx_n, x_n) = M_1 - δ_n,
where δ_n ≥ 0 and δ_n → 0. We have

|| Bx_n - M_1 x_n ||^2 = || Bx_n ||^2 - 2M_1 (Bx_n, x_n) + M_1^2 ≤
≤ M_1^2 - 2M_1 (M_1 - δ_n) + M_1^2 = 2M_1 δ_n,

whence it follows that || (B - M_1 E)x_n || → 0, and by Corollary 1 of
Theorem 2, B - M_1 E = A - ME has no bounded inverse.

We now show that A = m belongs to the spectrum. The bounds of 
the self-conjugate operator (—A) will be (—J£) and (—m), where 
—M < —m, so that, by what has been proved, —A + mE has no 
bounded inverse, i.e. it does not satisfy condition (65), so that A — mE 
likewise does not satisfy this condition. 

We have seen above that, if $\lambda$ belongs to the spectrum, but is not an eigenvalue, the lineal $R(A - \lambda E)$ is everywhere dense in $H$. We shall show that in this case $R(A - \lambda E)$ is not the whole of $H$. This follows at once from the next theorem [cf. 97]:

THEOREM 4. If $R(A)$ is the whole of $H$, the inverse operator $A^{-1}$ is bounded.

We notice first of all that, if $R(A)$ is $H$, it follows from what was said at the start of this section that $\lambda = 0$ is not an eigenvalue, i.e. there exists the inverse $A^{-1}$, defined throughout $H$.

Let $x$ be an element of $H$. We shall denote the element $Ax$ by the same letter with a prime, i.e. $x' = Ax$, $y' = Ay$ and so on. It is readily seen that the operator $A^{-1}$ has the following symmetry property:

$$(A^{-1}x', y') = (x', A^{-1}y'), \qquad (66)$$

where $x'$ and $y'$ are any elements of $H$. For, (66) is equivalent to the equation $(x, Ay) = (Ax, y)$, which holds because $A$ is self-conjugate. Given any choice of normalized element $y'$, expression (66) is a linear functional $l_{y'}(x')$ of $x'$. Given any fixed element $x'$, the numbers $|l_{y'}(x')|$ are bounded for this family of functionals. For, $|l_{y'}(x')| = |(A^{-1}x', y')| \le \|A^{-1}x'\|$. Hence it follows that the norms of functionals (66) are bounded with $\|y'\| = 1$ [100]. But these norms are equal to $\|A^{-1}y'\|$ [123], so that there exists a number $q$ such that $\|A^{-1}y'\| \le q$ with $\|y'\| = 1$, which is what we wanted to prove.

The theorem obviously holds for the operator $A - \lambda E$ with any real $\lambda$. The case of non-real $\lambda$ was discussed above. We shall investigate the spectrum of a self-conjugate operator in more detail in a subsequent section devoted to unbounded operators.


130. The resolvent. If $\lambda$ is not an eigenvalue of $A$, the resolvent of $A$ exists:

$$R_\lambda = (A - \lambda E)^{-1}.$$

It is defined on $R(A - \lambda E)$ and transforms this lineal one-to-one into $H$. It follows from the definition of inverse operator that $x = 0$ if $x$ belongs to $R(A - \lambda E)$ and $R_\lambda x = 0$.

We shall consider below the case when $\lambda$ is a regular point of $A$. In this case $R(A - \lambda E)$ is $H$, and $R_\lambda$ is a bounded operator defined throughout $H$.

Let us prove the following two formulae (for regular $\lambda$ and $\mu$):

$$R_\lambda^* = R_{\bar\lambda}; \qquad R_\mu - R_\lambda = (\mu - \lambda) R_\mu R_\lambda. \qquad (67)$$

If $\lambda$ is not real, $\bar\lambda$ is also not real, and hence is also a regular point. We can therefore assert that, for real and non-real $\lambda$, any elements $x'$ and $y'$ of $H$ can be written as

$$x' = (A - \lambda E)x; \qquad y' = (A - \bar\lambda E)y.$$

Hence it follows that

$$(R_\lambda x', y') = (x, (A - \bar\lambda E)y) = ((A - \lambda E)x, y) = (x', R_{\bar\lambda} y'),$$

and the first of (67) is an immediate consequence of this. The definition of resolvent implies that

$$R_\lambda x = R_\mu (A - \mu E) R_\lambda x; \qquad R_\mu x = R_\mu (A - \lambda E) R_\lambda x.$$

On subtracting term by term, we arrive at the second of (67).
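The two resolvent relations, $R_\lambda^* = R_{\bar\lambda}$ and $R_\mu - R_\lambda = (\mu - \lambda)R_\mu R_\lambda$, can be checked numerically in the simplest finite-dimensional case. The following is only an illustrative sketch: it assumes $H$ is replaced by the two-dimensional complex space, $A$ by a real symmetric $2 \times 2$ matrix, and the helper names (`resolvent`, `conj_transpose`, and so on) are hypothetical.

```python
# Illustrative sketch only: a 2x2 real symmetric matrix stands in for a
# self-conjugate operator; all helper names are hypothetical.

def mat_mul(p, q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def resolvent(a, lam):
    """R_lambda = (A - lambda E)^(-1) for a 2x2 matrix A and regular lambda."""
    m = [[a[0][0] - lam, a[0][1]], [a[1][0], a[1][1] - lam]]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if det == 0:
        raise ValueError("lambda is an eigenvalue of A")
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def conj_transpose(p):
    """Conjugate (adjoint) of a 2x2 complex matrix."""
    return [[p[j][i].conjugate() for j in range(2)] for i in range(2)]

def max_abs_diff(p, q):
    return max(abs(p[i][j] - q[i][j]) for i in range(2) for j in range(2))

A = [[2.0, 1.0], [1.0, 3.0]]          # real symmetric, hence self-conjugate
lam, mu = 1.0 + 2.0j, -0.5 + 1.0j     # non-real, hence regular points

R_lam, R_mu = resolvent(A, lam), resolvent(A, mu)

# First relation: the conjugate of R_lambda is the resolvent at conj(lambda).
err_adjoint = max_abs_diff(conj_transpose(R_lam),
                           resolvent(A, lam.conjugate()))

# Second relation: R_mu - R_lambda = (mu - lambda) R_mu R_lambda.
lhs = [[R_mu[i][j] - R_lam[i][j] for j in range(2)] for i in range(2)]
prod = mat_mul(R_mu, R_lam)
rhs = [[(mu - lam) * prod[i][j] for j in range(2)] for i in range(2)]
err_hilbert = max_abs_diff(lhs, rhs)

print(err_adjoint, err_hilbert)
```

Both printed discrepancies should be at the level of floating-point rounding. The check says nothing about the infinite-dimensional case, where the proof in the text is needed.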
131. Sequences of operators. Everything said about sequences of linear operators in B spaces [104] holds for $H$. Let us recall the basic facts and make some additions. The convergence in norm of a sequence of linear operators $A_n$ to a linear operator $A$ is defined by the condition $\|A - A_n\| \to 0$ as $n \to \infty$. The necessary and sufficient condition for this is that $\|A_n - A_m\| \to 0$ as $n$ and $m \to \infty$.

Strong convergence (or simply convergence) is defined by the fact that $A_n x \Rightarrow Ax$ for any $x \in H$. The norms $\|A_n\|$ are bounded. The necessary and sufficient condition for strong convergence is $\|A_n x - A_m x\| \to 0$ as $n$ and $m \to \infty$, for any $x \in H$. Convergence in norm implies strong convergence. If $A_n \to A$, $B_n \to B$ (in the sense of strong convergence or of convergence in norm) and the numbers $a_n \to a$, then $a_n A_n \to aA$, $A_n + B_n \to A + B$ and $B_n A_n \to BA$.

If $A_n$ are self-conjugate operators, and $A_n \to A$, then $A$ is also self-conjugate. For, $(A_n x, x)$ is real for any $n$ and any $x \in H$. Hence it follows that $(Ax, x) = \lim (A_n x, x)$ is real for any $x \in H$, so that $A$ is a self-conjugate operator.
Since we possess the concept of limit, we can consider infinite series of linear operators in $H$, e.g.

$$B_1 + B_2 + B_3 + \dots$$

and speak of their convergence in one sense or another. The following example will be important later. Let $A$ be a linear operator and $\|A\| = q < 1$. We form the series:

$$S = E + aA + a^2A^2 + \dots, \qquad (68)$$

where $a$ is a complex number. On writing $S_n$ for the partial sum of this series up to and including the term $a^nA^n$, we have

$$\|S_{n+p} - S_n\| = \|a^{n+1}A^{n+1} + a^{n+2}A^{n+2} + \dots + a^{n+p}A^{n+p}\|,$$

whence, assuming $|a| \le 1$,

$$\|S_{n+p} - S_n\| \le q^{n+1} + q^{n+2} + \dots = \frac{q^{n+1}}{1 - q},$$

so that $\|S_{n+p} - S_n\| \to 0$ as $n \to \infty$ for any $p > 0$, i.e. series (68) is convergent in norm when $|a| \le 1$. Since the upper bound of $\|S_{n+p} - S_n\|$ does not contain $a$, series (68) is said to be uniformly convergent with respect to $a$ for $|a| \le 1$. Since $\|A\| = q < 1$, we can say that series (68) is uniformly convergent in norm with respect to $a$ if $|a| \le 1 + \varepsilon$, where $\varepsilon > 0$ is chosen so small that

$$(1 + \varepsilon)q < 1.$$

On multiplying series (68) by $(E - aA)$ and taking into account what was said previously about passage to the limit for a sequence of operators, we get $(E - aA)S = S(E - aA) = E$, i.e. the sum of series (68) is the bounded inverse of the operator $(E - aA)$ for $|a| \le 1 + \varepsilon$, i.e. $S = (E - aA)^{-1}$.
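The statement that the sum of series (68) inverts $E - aA$ can be illustrated numerically. The sketch below assumes a finite-dimensional stand-in: $A$ is a $2 \times 2$ matrix whose norm is below 1 (its Frobenius norm, which bounds the operator norm, is about 0.37), and $a$ is a complex number with $|a| = 1$. The function names are hypothetical and the number of terms is an arbitrary choice.

```python
# Illustrative sketch: partial sums of S = E + aA + a^2 A^2 + ... for a small
# matrix A with norm < 1, compared against the defining property of the inverse.

def mat_mul(p, q):
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def neumann_sum(A, a, terms):
    """Partial sum E + aA + ... + a^(terms-1) A^(terms-1)."""
    n = len(A)
    aA = [[a * A[i][j] for j in range(n)] for i in range(n)]
    total = identity(n)
    power = identity(n)          # holds (aA)^k, starting with k = 0
    for _ in range(terms - 1):
        power = mat_mul(power, aA)
        total = [[total[i][j] + power[i][j] for j in range(n)]
                 for i in range(n)]
    return total

A = [[0.2, 0.1], [0.0, 0.3]]     # assumed example with ||A|| < 1
a = 0.6 + 0.8j                   # |a| = 1
S = neumann_sum(A, a, 60)

E = identity(2)
E_minus_aA = [[E[i][j] - a * A[i][j] for j in range(2)] for i in range(2)]
left = mat_mul(E_minus_aA, S)    # should be close to E
right = mat_mul(S, E_minus_aA)   # should be close to E

residual = max(abs(left[i][j] - E[i][j]) + abs(right[i][j] - E[i][j])
               for i in range(2) for j in range(2))
print(residual)
```

The residual is governed by the neglected tail, which is bounded by a multiple of $q^{60}$, so in practice only rounding error remains.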

The proof of the following statement is similar to the above: if the norms of operators $A_k$ do not exceed positive numbers $\delta_k$, which form a convergent series, the series

$$A = A_1 + A_2 + \dots$$

is convergent in norm, and the norm of $A$ is not greater than the sum of the series formed from the $\delta_k$. The last assertion follows from the fact that, if the norm of $A$ were greater than this sum, the norm of the operator

$$S_n = A_1 + A_2 + \dots + A_n$$

would also be greater than this sum for sufficiently large $n$, and this contradicts the obvious inequality

$$\|S_n\| \le \|A_1\| + \|A_2\| + \dots + \|A_n\| \le \delta_1 + \delta_2 + \dots + \delta_n.$$

A proposition similar to the above obviously also holds for normed spaces.


132. Weak convergence. Since we possess the general form of linear functional in $H$ [123], the weak convergence $x_n \xrightarrow{w} x_0$ is equivalent to the fact that $(x_n, y) \to (x_0, y)$ for any $y \in H$. We recall that $x_n \xrightarrow{w} x_0$ implies the existence of a number $m > 0$ such that $\|x_n\| \le m$ for all values of $n$. Further, since the conjugate space $H^*$ coincides with $H$, every bounded set in $H$ is weakly compact, and $H$ has weak completeness, i.e. if $(x_n - x_m, y) \to 0$ as $n$ and $m \to \infty$ for any $y \in H$, the sequence $x_n$ is weakly convergent. We also know that, if $x_n \xrightarrow{w} x_0$ and $A$ is a linear operator, $Ax_n \xrightarrow{w} Ax_0$. Let us now show that, if $z_k$ ($k = 1, 2, \dots$) is an orthonormal system, complete in $H$, and $\|x_n\| \le m$, the proof that $x_n \xrightarrow{w} x_0$ only requires a proof that $(x_n, z_k) \to (x_0, z_k)$ ($k = 1, 2, \dots$). For, let $(x_n, z_k) \to (x_0, z_k)$ ($k = 1, 2, \dots$). Any element $y \in H$ can be written as

$$y = \sum_{k=1}^{\infty} b_k z_k,$$

where

$$\sum_{k=1}^{\infty} |b_k|^2 = \|y\|^2.$$

We write

$$(x_n - x_0, y) = \Big(x_n - x_0, \sum_{k=1}^{N} b_k z_k\Big) + \Big(x_n - x_0, \sum_{k=N+1}^{\infty} b_k z_k\Big),$$

and take any given $\varepsilon > 0$. Since $\|x_n\| \le m$, we have [121]:

$$\Big|\Big(x_n - x_0, \sum_{k=N+1}^{\infty} b_k z_k\Big)\Big| \le \|x_n - x_0\| \cdot \Big\|\sum_{k=N+1}^{\infty} b_k z_k\Big\| \le (m + \|x_0\|)\Big(\sum_{k=N+1}^{\infty} |b_k|^2\Big)^{1/2},$$

and we can fix an $N$ such that the right-hand side of this inequality is $\le \varepsilon/2$. Now,

$$|(x_n - x_0, y)| \le \Big|\sum_{k=1}^{N} (x_n - x_0, b_k z_k)\Big| + \frac{\varepsilon}{2}.$$

Since $(x_n, z_k) \to (x_0, z_k)$, for all sufficiently large $n$ the first term on the right-hand side is $\le \varepsilon/2$ and $|(x_n - x_0, y)| \le \varepsilon$, whence it follows that $(x_n, y) \to (x_0, y)$.

Let us prove the following theorem. 

THEOREM 1. If $x_n \xrightarrow{w} x_0$ and $y_n \Rightarrow y_0$, then $(x_n, y_n) \to (x_0, y_0)$ and $(y_n, x_n) \to (y_0, x_0)$. It is sufficient to show that $(x_n, y_n) \to (x_0, y_0)$. The second assertion is obtained by interchanging the elements. We can write

$$|(x_0, y_0) - (x_n, y_n)| \le |(x_0, y_0) - (x_n, y_0)| + |(x_n, y_0) - (x_n, y_n)|,$$

or

$$|(x_0, y_0) - (x_n, y_n)| \le |(x_0, y_0) - (x_n, y_0)| + |(x_n, y_0 - y_n)|.$$

On recalling that $x_n \xrightarrow{w} x_0$ implies the existence of an $m > 0$ such that $\|x_n\| \le m$, and using Buniakowski's inequality, we get

$$|(x_0, y_0) - (x_n, y_n)| \le m\|y_0 - y_n\| + |(x_0, y_0) - (x_n, y_0)|.$$

The first term on the right $\to 0$, since $y_n \Rightarrow y_0$, whilst the second $\to 0$ because $x_n \xrightarrow{w} x_0$. Therefore $(x_n, y_n) \to (x_0, y_0)$, and the theorem is proved.

THEOREM 2. If $x_n \xrightarrow{w} x_0$ and $\|x_n\| \to \|x_0\|$, then $x_n \Rightarrow x_0$. We have

$$\|x_0 - x_n\|^2 = \|x_0\|^2 + \|x_n\|^2 - (x_n, x_0) - (x_0, x_n).$$

It follows from the hypothesis that $(x_n, x_0) \to \|x_0\|^2$, $(x_0, x_n) \to \|x_0\|^2$ and $\|x_n\|^2 \to \|x_0\|^2$, so that $\|x_0 - x_n\|^2 \to 0$, which is what we needed to prove.

As indicated in [101], this theorem also holds for certain B spaces.

133. Completely continuous operators. We have had the definition of completely continuous operator for a B space and hence for $H$. A completely continuous operator in $H$ is a linear operator such that it transforms every set bounded in $H$ into a compact set.

We know that every linear operator transforms a compact set into a compact set. Notice that the operator of the identity transformation is not completely continuous. It transforms the sphere $\|x\| \le 1$ (a bounded set) one-to-one into itself, and such a sphere is not compact. To see this, we only need to take an infinite orthonormal sequence of elements $z_k$ ($k = 1, 2, \dots$). It is bounded, since $\|z_k\| = 1$, but not compact, because $\|z_p - z_q\| = \sqrt{2}$ for $p \ne q$.

Two new definitions of completely continuous operator will be 
given below, and their equivalence to the above fundamental def- 
inition will be proved. A simple preliminary remark is required. 

If, given two sequences $x_n$ and $y_n$, one is weakly and the other strongly convergent, to $x_0$ and $y_0$, and if $A$ is a bounded linear operator, we have

$$\lim_{n \to \infty} (Ax_n, y_n) = (Ax_0, y_0). \qquad (69)$$

This follows at once from Theorem 1 [132] and the fact that, if $x_n \xrightarrow{w} x_0$, then $Ax_n \xrightarrow{w} Ax_0$, and if $x_n \Rightarrow x_0$, then $Ax_n \Rightarrow Ax_0$. We shall now give two new definitions of completely continuous operator.

DEFINITION 1. A linear operator $A$ is said to be completely continuous if (69) holds for any sequences $x_n$ and $y_n$, weakly convergent to $x_0$ and $y_0$.

DEFINITION 2. A linear operator $A$ is said to be completely continuous if $x_n \xrightarrow{w} x_0$ implies $Ax_n \Rightarrow Ax_0$.

Let us show that these definitions are equivalent. Let $A$ satisfy condition (69) for the weakly convergent sequences $x_n$ and $y_n$. We can write

$$\|Ax_n - Ax_0\|^2 = (Ax_n, Ax_n - Ax_0) - (Ax_0, Ax_n - Ax_0).$$

If $x_n \xrightarrow{w} x_0$, then $Ax_n - Ax_0 \xrightarrow{w} 0$, and both terms on the right-hand side tend to zero by (69), i.e. $\|Ax_n - Ax_0\| \to 0$ or $Ax_n \Rightarrow Ax_0$. Thus the second definition follows from the first. Now suppose that $x_n \xrightarrow{w} x_0$ implies that $Ax_n \Rightarrow Ax_0$. Formula (69) now follows at once from Theorem 1. Now that these two definitions have been proved equivalent, both will be equivalent to the basic definition once it is shown that Definition 2 is equivalent to the basic definition. Let $A$ satisfy Definition 2 and $x_n$ be a bounded sequence of elements. We can choose a subsequence $x_{n_k}$ such that $x_{n_k} \xrightarrow{w} x_0$, and hence, by Definition 2, $Ax_{n_k} \Rightarrow Ax_0$, i.e. the set $Ax_n$ is compact, and the basic definition follows from Definition 2. Suppose conversely that $A$ satisfies the basic definition, and let $x_n \xrightarrow{w} x_0$. We have to show that $Ax_n \Rightarrow Ax_0$. We use reductio ad absurdum. Suppose that $Ax_n$ does not converge strongly to $Ax_0$, i.e. there exists a subsequence of subscripts $n_k$ such that $\|Ax_{n_k} - Ax_0\| \ge a > 0$. By the basic definition, the set $Ax_{n_k}$ is compact, and we can assume that $Ax_{n_k}$ is strongly convergent to some element $x'$, which, since $\|Ax_{n_k} - Ax_0\| \ge a > 0$, must differ from $Ax_0$. But $x_{n_k} \xrightarrow{w} x_0$, and consequently $Ax_{n_k} \xrightarrow{w} Ax_0$, whilst by the foregoing, $Ax_{n_k} \Rightarrow x' \ne Ax_0$, and all the more, $Ax_{n_k} \xrightarrow{w} x' \ne Ax_0$. We have arrived at a contradiction.

Thus the new definitions of completely continuous operator are equivalent to the basic definition. We shall explain later the concepts of weak convergence in $l_2$ and $L_2$.

THEOREM. If $A$ is a linear operator and $A^*A$ is completely continuous, $A$ is also completely continuous. Let $x_n$ ($n = 1, 2, \dots$) be a bounded sequence of elements ($\|x_n\| \le a$). By hypothesis, $A^*Ax_n$ is compact, i.e. there exists a convergent subsequence $A^*Ax_{n_k}$. Let us show that $Ax_{n_k}$ is also a convergent subsequence, whence the theorem will follow. We have

$$\|Ax_{n_k} - Ax_{n_l}\|^2 = (A^*A(x_{n_k} - x_{n_l}), x_{n_k} - x_{n_l}) \le \|A^*Ax_{n_k} - A^*Ax_{n_l}\| \cdot \|x_{n_k} - x_{n_l}\| \le 2a\|A^*Ax_{n_k} - A^*Ax_{n_l}\|.$$

In view of the convergence of the sequence $A^*Ax_{n_k}$, the right-hand side $\to 0$ as $n_k$ and $n_l \to \infty$, so that $\|Ax_{n_k} - Ax_{n_l}\| \to 0$, i.e. $Ax_{n_k}$ is a convergent sequence.

COROLLARY. If $A$ is completely continuous, $A^*$ is also completely continuous.

If A is completely continuous, AA* = (A*)*A* is completely 
continuous; but now, by Theorem 2, applied to A*, A* is also com- 
pletely continuous. 

Let us recall the following property of sequences of operators: if a sequence $A_n$ of completely continuous operators is convergent in norm to a linear operator $A$, $A$ is also a completely continuous operator [106]. A special class of completely continuous operators must be mentioned.

DEFINITION. A linear operator $D$ is said to be finite-dimensional if it can be written in the form

$$Dx = \sum_{k=1}^{m} (x, v_k) u_k, \qquad (70)$$

where $u_k$ and $v_k$ ($k = 1, 2, \dots, m$) are fixed elements of $H$.

It may easily be seen that a finite-dimensional operator is completely continuous. For, $x_n \xrightarrow{w} x_0$ implies that $(x_n, v_k) \to (x_0, v_k)$, and by (70), $Dx_n \Rightarrow Dx_0$.

It follows at once from what has been said that, if A, is a sequence 
of finite-dimensional operators, tending in norm to the linear operator 
A, then A is completely continuous. 

It will be shown in the next section that every completely con- 
tinuous operator can be written as the limit in norm of a sequence of 
finite-dimensional operators. 


134. Spaces $H$ and $l_2$. Let

$$z_1, z_2, z_3, \dots \qquad (71)$$

be a complete orthonormal system in $H$. By using it, we can map $H$ one-to-one into space $l_2$, the elements of which are infinite sequences of complex numbers $(\xi_1, \xi_2, \dots)$, on condition that the series

$$\sum_{k=1}^{\infty} |\xi_k|^2 \qquad (72)$$

is convergent [121]. Any element $x \in H$ is characterized by its Fourier coefficients: $\xi_k = (x, z_k)$, and we have the form

$$x = \sum_{k=1}^{\infty} \xi_k z_k. \qquad (73)$$

Conversely, if an element $(\xi_1, \xi_2, \dots)$ of $l_2$ is given, series (73) is convergent in $H$ and yields the corresponding element of $H$. This correspondence is one-to-one, the scalar product in $H$ being equal to the scalar product of the corresponding elements of $l_2$ [60, 121]:

$$(x, y) = \sum_{k=1}^{\infty} \xi_k \bar\eta_k,$$

where $x$ corresponds to $(\xi_1, \xi_2, \dots)$ and $y$ to $(\eta_1, \eta_2, \dots)$. Thus $\|x\|$ is equal to the norm of the corresponding element in $l_2$ and convergence in $H$ and $l_2$ is equivalent. We thus have an isomorphic mapping of $H$ into $l_2$. The following elements of $l_2$ correspond to the elements $z_k$ of the orthonormal system (71):

$$(1, 0, 0, 0, \dots); \quad (0, 1, 0, 0, \dots); \quad (0, 0, 1, 0, \dots); \quad \dots$$

Let $\xi(\xi_1, \xi_2, \dots)$ be an element of $l_2$ and $\xi^{(n)}(\xi_1, \xi_2, \dots, \xi_n, 0, 0, \dots)$ be the cut-off element, in which the first $n$ components are equal to the corresponding components of $\xi$, and the remainder equal to zero. We have

$$\|\xi - \xi^{(n)}\|^2 = \sum_{k=n+1}^{\infty} |\xi_k|^2,$$

and, since series (72) is convergent, $\xi^{(n)} \Rightarrow \xi$ in $l_2$.
Let $A$ be a linear operator in $H$. In view of its continuity and (73), we can write

$$y = Ax = \sum_{k=1}^{\infty} \xi_k A z_k. \qquad (74)$$

The components $(\eta_1, \eta_2, \dots)$ of the element of $l_2$ corresponding to the element $y$, are defined by

$$\eta_i = (y, z_i) = \sum_{k=1}^{\infty} \xi_k (Az_k, z_i). \qquad (75)$$

We have made use here of the continuity of the scalar product. On introducing the numbers

$$a_{ik} = (Az_k, z_i), \qquad (76)$$

a linear operator $A$ in $H$ is seen to correspond with the operator in $l_2$:

$$\eta_i = \sum_{k=1}^{\infty} a_{ik}\xi_k \quad (i = 1, 2, \dots), \qquad (77)$$

which is defined by an infinite matrix with elements $a_{ik} = (Az_k, z_i)$. The conjugate operator $A^*$ corresponds to the matrix with elements

$$a^*_{ik} = (A^*z_k, z_i) = (z_k, Az_i),$$

i.e.

$$a^*_{ik} = \bar a_{ki}. \qquad (78)$$

A self-conjugate operator is characterized by the equation

$$a_{ik} = \bar a_{ki}. \qquad (79)$$
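The passage from an operator to its matrix can be made concrete in a small case. The sketch below assumes a two-dimensional complex space in place of $H$, the scalar product $(x, y) = \sum_j x_j \bar y_j$, an orthonormal basis chosen purely for illustration, and the index convention $a_{ik} = (Az_k, z_i)$; all function names are hypothetical. It checks that the components of $Ax$ are obtained from those of $x$ by the matrix $a_{ik}$, and that the matrix of $A^*$ has elements $\bar a_{ki}$.

```python
import math

# Illustrative sketch in C^2; the basis and the operator are assumed examples.

def inner(x, y):
    """Scalar product (x, y): linear in x, conjugate-linear in y."""
    return sum(xj * yj.conjugate() for xj, yj in zip(x, y))

def apply(mat, x):
    """Action of a matrix (in standard coordinates) on a vector."""
    return [sum(mat[i][j] * x[j] for j in range(len(x)))
            for i in range(len(mat))]

def adjoint(mat):
    """Conjugate operator in standard coordinates: the conjugate transpose."""
    n = len(mat)
    return [[mat[j][i].conjugate() for j in range(n)] for i in range(n)]

def matrix_in_basis(mat, basis):
    """Elements a_ik = (A z_k, z_i) with respect to an orthonormal basis."""
    n = len(basis)
    return [[inner(apply(mat, basis[k]), basis[i]) for k in range(n)]
            for i in range(n)]

s = 1 / math.sqrt(2)
basis = [[s + 0j, s * 1j], [s + 0j, -s * 1j]]     # orthonormal z_1, z_2
A = [[1 + 0j, 2 - 1j], [0.5j, -1 + 0j]]           # arbitrary linear operator

a = matrix_in_basis(A, basis)
a_star = matrix_in_basis(adjoint(A), basis)

x = [0.3 - 0.2j, 1.0 + 0.5j]
xi = [inner(x, z) for z in basis]                 # Fourier coefficients of x
eta_direct = [inner(apply(A, x), z) for z in basis]
eta_matrix = [sum(a[i][k] * xi[k] for k in range(2)) for i in range(2)]

# Components of Ax agree with the matrix acting on the components of x.
err_components = max(abs(u - v) for u, v in zip(eta_direct, eta_matrix))
# Matrix of the conjugate operator: a*_ik equals the conjugate of a_ki.
err_conjugate = max(abs(a_star[i][k] - a[k][i].conjugate())
                    for i in range(2) for k in range(2))
print(err_components, err_conjugate)
```

Both discrepancies should vanish up to rounding. With the opposite index convention the roles of rows and columns are simply interchanged.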
We introduce the set $L$ of elements of $H$ expressible as

$$x = \sum_{k=1}^{m} \xi_k z_k,$$

where $\xi_k$ are any complex numbers and $m$ is a fixed positive integer. We have $\xi_k = (x, z_k)$, and $L$ is easily shown to be a subspace. The subspace $M$ orthogonal to it is obviously the set of elements $x$ expressible as

$$x = \sum_{k=m+1}^{\infty} \xi_k z_k,$$

where $\xi_k$ are complex numbers such that the series

$$\sum_{k=m+1}^{\infty} |\xi_k|^2 \qquad (80)$$

is convergent. Space $H$ can be written as [122]:

$$H = L \oplus M. \qquad (81)$$

Writing $P_L$ and $P_M$ for the projectors into $L$ and $M$, we have

$$E = P_L + P_M. \qquad (82)$$

Let $A$ be a linear operator. We bring in the two operators:

$$A_1 = P_L A; \qquad A_2 = P_M A. \qquad (83)$$

By (82): $A = A_1 + A_2$. Since $P_L Ax \in L$ for any $x \in H$:

$$P_L Ax = \sum_{k=1}^{m} a_k z_k,$$

where

$$a_k = (P_L Ax, z_k) = (Ax, P_L z_k) = (Ax, z_k) = (x, A^*z_k),$$

i.e.

$$P_L Ax = A_1 x = \sum_{k=1}^{m} (x, A^*z_k) z_k, \qquad (84)$$

whence it follows that $P_L A = A_1$ is a finite-dimensional operator. Similarly, we have

$$A_2 x = \sum_{k=m+1}^{\infty} (Ax, z_k) z_k = \sum_{k=m+1}^{\infty} (x, A^*z_k) z_k. \qquad (85)$$

Thus the element $A_2 x$ corresponds to an element $(\xi_1, \xi_2, \dots)$, the components of which are given by

$$\xi_k = 0 \text{ for } k \le m \quad\text{and}\quad \xi_k = (Ax, z_k) = (x, A^*z_k) \text{ for } k > m. \qquad (86)$$
Now let $A$ be a completely continuous operator, and $U$ a set of normalized elements ($\|x\| = 1$). Now, if $x \in U$, $Ax$ is a compact set, and hence the corresponding set in $l_2$ is compact. The components of the elements of this set are given by $\xi_k = (Ax, z_k)$, and since it is compact, we can say that there exists for any normalized $x$ a positive number $C$ such that

$$\sum_{k=1}^{\infty} |(Ax, z_k)|^2 \le C$$

and that, given any $\varepsilon > 0$, there exists a positive integer $m_\varepsilon$ such that [92]

$$\sum_{k=m_\varepsilon+1}^{\infty} |(Ax, z_k)|^2 \le \varepsilon^2.$$

But it follows from (85) that

$$\|A_2 x\|^2 = \sum_{k=m+1}^{\infty} |(Ax, z_k)|^2,$$

so that, given any $\varepsilon > 0$, there exists an $m = m_\varepsilon$ such that $\|A_2 x\| \le \varepsilon$ for $\|x\| = 1$, i.e. $\|A_2\| \le \varepsilon$. We have arrived at the following theorem.

THEOREM 1. If $A$ is a completely continuous operator and $\varepsilon > 0$ is any given number, there exists a positive integer $m$ such that $\|A_2\| \le \varepsilon$, where $A_2$ is the operator defined above.

If we take a sequence of positive numbers $\varepsilon_n$ tending to zero, we get a sequence of finite-dimensional operators $A_1^{(n)}$ such that $\|A - A_1^{(n)}\|$ tends to zero, i.e. every completely continuous operator is a limit in norm of finite-dimensional operators.

On recalling what was said in [133], we can assert that the following definition of completely continuous operator is equivalent to the original (a linear operator transforming every bounded set into a compact set).

DEFINITION. A linear operator is said to be completely continuous if it is the limit in norm of a sequence of finite-dimensional operators.
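The choice of $m$ in Theorem 1 can be illustrated on the simplest completely continuous operator, a diagonal one with $Az_k = z_k/k$. The sketch below is only an illustration under stated assumptions: the space is cut off at $N = 1000$ coordinates (so every operator in the code is in fact finite-dimensional), the function names are hypothetical, and for a diagonal operator the norm of $A_2 = P_M A$ is taken to be the largest $|d_k|$ with $k > m$.

```python
import math

# Illustrative sketch: A z_k = d_k z_k with d_k = 1/k, truncated to N terms.

def tail_norm(diag, m):
    """Norm of A_2 = P_M A for a diagonal operator: max |d_k| over k > m."""
    return max((abs(d) for d in diag[m:]), default=0.0)

def smallest_cutoff(diag, eps):
    """Smallest m for which ||A_2|| <= eps."""
    for m in range(len(diag) + 1):
        if tail_norm(diag, m) <= eps:
            return m

N = 1000
diag = [1.0 / k for k in range(1, N + 1)]
eps = 0.15
m = smallest_cutoff(diag, eps)

# Apply A_2 to one normalized element and compare with the bound eps.
x = [1.0 / math.sqrt(N)] * N
A2x = [0.0 if k < m else diag[k] * x[k] for k in range(N)]
norm_A2x = math.sqrt(sum(v * v for v in A2x))
print(m, tail_norm(diag, m), norm_A2x)
```

Here $m = 6$ suffices, since $1/7 \le 0.15 < 1/6$. Halving $\varepsilon$ roughly doubles the required $m$, which reflects how slowly the numbers $1/k$ decay.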


135. Linear equations in completely continuous operators. Let us consider the solubility in space $H$ of equations of the form

$$x - Ax = y, \qquad (87)$$
$$x - A^*x = y, \qquad (88)$$

where $A$ is a completely continuous operator, $A^*$ the conjugate to $A$, $y$ a given element and $x$ the required element of $H$. As was shown by Riesz, the fundamental theorems of the theory of integral equations (the Fredholm theorems) remain valid for equations (87) and (88) in B spaces as well as in $H$ space (as proved in [107]). Let us investigate these equations in $H$.

We fix the number $m$ appearing in the formation of operators $A_1$ and $A_2$ of the previous section so as to have $\|A_2\| < 1$. Then $\|A_2^*\| < 1$. Thus

$$\|A_2\| = \|A_2^*\| < 1, \qquad (89)$$

and equations (87) and (88) can be rewritten as

$$(E - A_2)x - A_1 x = y, \qquad (90)$$
$$(E - A_2^*)x - A_1^* x = y. \qquad (91)$$

By (89), the operators $(E - A_2)$ and $(E - A_2^*)$ have bounded inverses [131]. We introduce the following notation:

$$\tilde x = (E - A_2)x; \qquad \tilde y = (E - A_2^*)^{-1}y. \qquad (92)$$

We rewrite (90) in terms of $\tilde x$ instead of $x$, and apply the operator $(E - A_2^*)^{-1}$ to both sides of (91). This gives us

$$\tilde x - B\tilde x = y, \qquad (93)$$
$$x - B^*x = \tilde y, \qquad (94)$$

where $y$ is the given element and

$$B = A_1(E - A_2)^{-1}; \qquad B^* = (E - A_2^*)^{-1}A_1^*,$$

$B^*$ being the conjugate to $B$. Equation (94) is equivalent to (91), and solving (90) amounts to solving (93) and using the formula $x = (E - A_2)^{-1}\tilde x$. Solving equations (90) and (91) thus amounts to solving (93) and (94). Let us also write the corresponding homogeneous equations:

$$\tilde x - B\tilde x = 0; \qquad x - B^*x = 0. \qquad (95)$$

The operator $B = P_L A(E - A_2)^{-1}$ is finite-dimensional, and $Bx$ is given by (84) after $A$ has been replaced by $A(E - A_2)^{-1}$. The matrix corresponding to operator $B$ in $l_2$ will have the elements

$$a_{ki} = (P_L A(E - A_2)^{-1}z_i, z_k) = (A(E - A_2)^{-1}z_i, P_L z_k). \qquad (96)$$

But $P_L z_k = 0$ for $k > m$, so that $a_{ki} = 0$ for $k > m$. Let $\tilde\xi_k$ and $\eta_k$ be the components of elements of $l_2$ corresponding to the elements $\tilde x$ and $y$ of $H$. Equation (93) becomes in $l_2$:

$$\tilde\xi_k - \sum_{i=1}^{\infty} a_{ki}\tilde\xi_i = \eta_k \quad (k = 1, 2, \dots, m), \qquad (97)$$
$$\tilde\xi_k = \eta_k \quad (k = m+1, m+2, \dots), \qquad (98)$$

where $\tilde\xi_k$ are required, and $\eta_k$ are given numbers. Hence all the $\tilde\xi_k$ are known for $k > m$, and the solution of (93) amounts in $l_2$ to solving a system of $m$ equations with $m$ unknowns $\tilde\xi_k$ ($k = 1, 2, \dots, m$):

$$\tilde\xi_k - \sum_{i=1}^{m} a_{ki}\tilde\xi_i = \eta_k + \sum_{i=m+1}^{\infty} a_{ki}\eta_i. \qquad (99)$$

The matrix corresponding to $B^*$ is $a^*_{ki} = \bar a_{ik}$ [134], so that (94) has the form in $l_2$:

$$\xi_k - \sum_{i=1}^{m} \bar a_{ik}\xi_i = \tilde\eta_k \quad (k = 1, 2, \dots), \qquad (100)$$

where $\xi_k$ and $\tilde\eta_k$ are the components of the elements of $l_2$ corresponding to $x$ and $\tilde y$ of $H$.

Each solution $(\xi_1^{(0)}, \xi_2^{(0)}, \dots, \xi_m^{(0)})$ of the first $m$ equations of the system

$$\xi_k - \sum_{i=1}^{m} \bar a_{ik}\xi_i = \tilde\eta_k \quad (k = 1, 2, \dots, m) \qquad (101)$$

leads to a corresponding single definite solution $(\xi_1^{(0)}, \xi_2^{(0)}, \dots, \xi_m^{(0)}, \xi_{m+1}^{(0)}, \dots)$ of the entire system (100), whatever the remaining $\tilde\eta_k$ ($k = m+1, m+2, \dots$), in which the remaining unknowns $\xi_k$ ($k = m+1, m+2, \dots$) are defined by the formulae

$$\xi_k = \sum_{i=1}^{m} \bar a_{ik}\xi_i + \tilde\eta_k \quad (k = m+1, m+2, \dots). \qquad (102)$$

Notice that the $\tilde\xi_k$ obtained from (98) and (99), and the $\xi_k$ from (101) and (102), are such that the series with general terms $|\tilde\xi_k|^2$ and $|\xi_k|^2$ are convergent. This follows at once from (98) for $\tilde\xi_k$, and from (102) for $\xi_k$, if we take into account the convergence over $k$ of the series with general terms $|\tilde\eta_k|^2$ and $|a_{ik}|^2$. The last follows from the fact that, by (96), $\bar a_{ik} = (B^*z_i, z_k)$, i.e. the numbers $\bar a_{ik}$ ($k = 1, 2, \dots$) are the Fourier coefficients of the element $B^*z_i$.

In the homogeneous case we have to put $\eta_k = \tilde\eta_k = 0$. The homogeneous system (99) can be rewritten as

$$\tilde\xi_k - \sum_{i=1}^{m} a_{ki}\tilde\xi_i = 0 \quad (k = 1, 2, \dots, m) \qquad (103)$$

and $\tilde\xi_k = 0$ for $k > m$; and system (101) as

$$\xi_k - \sum_{i=1}^{m} \bar a_{ik}\xi_i = 0 \quad (k = 1, 2, \dots, m), \qquad (104)$$
$$\xi_k = \sum_{i=1}^{m} \bar a_{ik}\xi_i \quad \text{for } k > m. \qquad (105)$$

Notice that the linearly independent solutions of the finite homogeneous system (104) generate, by (105), linearly independent solutions of the entire homogeneous system of an infinite number of equations, corresponding to the homogeneous equation (94) in $H$. The linearly dependent solutions of system (104) generate linearly dependent solutions of the entire system. On recalling the basic results regarding the solutions of systems of equations, and the fact that the matrices of the coefficients of systems (103) and (104) have the same rank, we get the following theorem:

THEOREM 1. Non-homogeneous equations (87) and (88) are soluble 
with any right-hand sides y when and only when the corresponding 
homogeneous equations (y = 0) have only the zero solution. In this case 
the solutions of (87) and (88) are unique for any given y. The homogeneous 
equations x — Ax = 0 and x — A*x = 0 have the same finite number 
of linearly independent solutions. 

We now consider non-homogeneous equation (87) in the case when 
the homogeneous equation has non-zero solutions; in fact, we prove 
the following theorem. 

THEOREM 2. The necessary and sufficient condition for the non-homogeneous equation (87) to have a solution in this case is that the right-hand side $y$ be orthogonal to all the solutions of the homogeneous equation

$$z - A^*z = 0. \qquad (106)$$

Necessity. We shall give the proof without having recourse to $l_2$. Let (87) have a solution $x_0$, i.e. $x_0 - Ax_0 = y$, and let $z$ be any given solution of (106), i.e. $z - A^*z = 0$. We have to show that $(y, z) = 0$. We have

$$(y, z) = (x_0 - Ax_0, z) = (x_0, z - A^*z) = (x_0, 0) = 0.$$

Sufficiency. Given that $y$ is orthogonal to all the solutions of (106), we have to show that (87) has a solution. On passing to $l_2$, we have by hypothesis:

$$\sum_{k=1}^{\infty} \eta_k \bar\xi_k = 0, \qquad (107)$$

where $(\xi_1, \xi_2, \dots, \xi_m)$ is any solution of system (104) and the $\xi_k$ for $k > m$ are given by (105). On substituting these expressions for $\xi_k$ when $k > m$, we can rewrite (107) as

$$\sum_{k=1}^{m} \Big(\eta_k + \sum_{i=m+1}^{\infty} a_{ki}\eta_i\Big)\bar\xi_k = 0.$$

Since the sums in the curved brackets are the right-hand sides of equations (99), whilst $(\xi_1, \xi_2, \dots, \xi_m)$ is any solution of system (104), we can say that system (99) has a solution [III$_1$; 15], so that equation (87) has a solution, and the theorem is proved.
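The orthogonality condition of Theorem 2 can be seen at work in a finite-dimensional example. The sketch below assumes a real two-dimensional space, so that the conjugate operator is the transposed matrix and no complex conjugation is needed. The matrix $A$ is an assumed example with eigenvalue 1, the helper names are hypothetical, and solubility is decided by comparing the rank of the coefficient matrix with that of the augmented matrix, using exact rational arithmetic.

```python
from fractions import Fraction

# Illustrative sketch: x - Ax = y in a real 2-dimensional space.

def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [u - factor * v for u, v in zip(m[i], m[r])]
        r += 1
    return r

def solvable(M, y):
    """True if M x = y has a solution (rank of M equals rank of [M | y])."""
    augmented = [list(row) + [yi] for row, yi in zip(M, y)]
    return rank(M) == rank(augmented)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

half = Fraction(1, 2)
A = [[1, 1], [0, half]]                       # assumed example, eigenvalue 1
E_minus_A = [[(1 if i == j else 0) - A[i][j] for j in range(2)]
             for i in range(2)]
E_minus_A_star = [[E_minus_A[j][i] for j in range(2)] for i in range(2)]

z = [1, 2]                                    # solves z - A*z = 0
z_is_solution = all(dot(row, z) == 0 for row in E_minus_A_star)

y_good = [2, -1]                              # orthogonal to z
y_bad = [1, 0]                                # not orthogonal to z
print(z_is_solution, dot(y_good, z), solvable(E_minus_A, y_good))
print(dot(y_bad, z), solvable(E_minus_A, y_bad))
```

For `y_good` the system is soluble (with a one-parameter family of solutions, since the homogeneous equation has the non-zero solution $(1, 0)$), while for `y_bad` it is not.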

We now consider the equation

$$x - \mu Ax = y, \qquad (108)$$

where $A$ is a completely continuous operator and $\mu$ is a complex parameter. The operator $\mu A$ is also completely continuous, and the theorems proved above are applicable to equation (108). In particular, (108) is soluble with any $y$ (and uniquely so), if the homogeneous equation

$$x - \mu Ax = 0 \quad\text{or}\quad Ax = \lambda x \quad \Big(\lambda = \frac{1}{\mu}\Big) \qquad (109)$$

has only the zero solution (this is obvious with $\mu = 0$). If (109) has non-zero solutions, the corresponding $\lambda$ is an eigenvalue of the operator $A$. We now prove the following:

THEOREM 3. There can exist only a finite number of eigenvalues, satisfying the condition $|\lambda| \ge r$, where $r$ is any given positive number. In other words, we have to show that there can only exist a finite number of values of $\mu$ satisfying the condition $|\mu| \le 1/r$ for which (109) has non-zero solutions. The proof of this assertion is directly bound up with the construction that we used in proving Theorem 1.

As in Theorem 1, we put

$$A = A_1 + A_2,$$

where we fix $m$ so large that $\|A_2\|/r = q < 1$. The operator $(E - \mu A_2)$ now has a bounded inverse for $|\mu| \le 1/r$, and it is expressible by the series

$$(E - \mu A_2)^{-1} = E + \mu A_2 + \mu^2 A_2^2 + \dots, \qquad (110)$$

which is uniformly convergent in norm with respect to $\mu$ for $|\mu| \le (1/r) + \varepsilon$ [131], where $\varepsilon$ is a sufficiently small positive number. The values of $\mu$ for which the equation has non-zero solutions are found by equating to zero the determinant of system (103), i.e. the determinant $\Delta$ with elements $\delta_{ki} - a_{ki}$, where $\delta_{ki} = 0$ for $k \ne i$, and $\delta_{ki} = 1$ for $k = i$, and

$$a_{ki} = (P_L\, \mu A(E - \mu A_2)^{-1} z_i, z_k).$$

In view of the convergence of series (110), our remarks about passage to the limit for a sequence of operators, and the continuity of the scalar product, we can assert that $a_{ki}$ are regular functions in the circle $|\mu| \le 1/r$. The determinant $\Delta$ obviously has the same property, so that the equation $\Delta = 0$ has only a finite number of roots satisfying $|\mu| \le 1/r$, which is what we set out to prove.

This last theorem can be alternatively stated as: the eigenvalues $\lambda$ of a completely continuous operator can only have $\lambda = 0$ as a limit point.

It follows from what has been said that the rank of any eigenvalue $\lambda = 1/\mu$ with $|\mu| \le 1/r$ does not exceed the number $m$ in system (103) with the condition $\|A_2\|/r = q < 1$. If $A$ is not self-conjugate, it may not have any eigenvalues [IV; 13].


136. Completely continuous self-conjugate operators. We investigated the properties of the spectrum and the expansion in eigenfunctions of a completely continuous self-conjugate operator in [IV; 38, 39]. All the proofs can be carried over without change to space $H$. But we have postulated that $H$ is complete, and this fact was not used in the proofs of Volume IV. Thus new results may be obtained for $H$. Let us first state a theorem which is obtained from the results of Volume IV. Remember that all the eigenvalues of a self-conjugate operator are real.

THEOREM 1. Every self-conjugate completely continuous operator $A$, different from the annihilation operator, has at least one eigenvalue different from zero. All the eigenvalues of $A$ different from zero have a finite rank and only a finite number of eigenvalues can lie outside any interval $[-\varepsilon, +\varepsilon]$, where $\varepsilon > 0$. Every element of the form $Ax$ ($x \in H$) can be expanded as a Fourier series in the orthonormal system of eigenelements $x_k$ corresponding to the eigenvalues that differ from zero:

$$Ax = \sum_k (Ax, x_k) x_k = \sum_k (x, x_k)\lambda_k x_k. \qquad (111)$$

Sum (111) can contain either a finite or an infinite number of terms. Further, it may be recalled that the eigenvalues $\lambda_k$ and eigenelements $x_k$ that form the orthonormal system are obtained from the solution of successive extremal problems for the quadratic form $(Ax, x)$. This provides the basis for the proof of the fundamental theorem in Volume IV.
Suppose that sum (111) contains an infinite number of terms. Let $x$ be any element of $H$. We form the difference:

$$z = x - \sum_{k=1}^{\infty} (x, x_k) x_k. \qquad (112)$$

The series written is convergent [121]. By (111),

$$A\Big[x - \sum_{k=1}^{\infty} (x, x_k) x_k\Big] = 0.$$

It will be seen from this that $z$ satisfies the equation

$$Az = 0, \quad\text{or}\quad Az = 0 \cdot z, \qquad (113)$$

i.e. $z$ is either the zero element or the eigenelement of $A$ corresponding to the eigenvalue $\lambda = 0$. Let $z_1, z_2, \dots$ be a complete orthonormal system of eigenelements corresponding to $\lambda = 0$. If $\lambda = 0$ is not an eigenvalue, there will be no such elements. If $\lambda = 0$ is an eigenvalue, it can be either of finite or infinite rank. Since $z$ is a solution of equation (113), we can say that

$$x - \sum_{k=1}^{\infty} (x, x_k) x_k = \sum_l c_l z_l, \qquad (114)$$

where

$$c_l = \Big(x - \sum_{k=1}^{\infty} (x, x_k) x_k,\ z_l\Big)$$

or, since $(x_k, z_l) = 0$ [128], we get $c_l = (x, z_l)$, and it follows from (114) that any element $x$ can be expanded in a Fourier series in eigenelements of $A$, these elements including those corresponding to the eigenvalue $\lambda = 0$. We have thus proved the following.

THEOREM 2. The orthonormal system of eigenelements of a completely continuous self-conjugate operator is a complete system.

In other words, we can say, using the terminology of [128], that a completely continuous self-conjugate operator has a purely point spectrum.

The whole of the above discussion is applicable to the case when sum (111) consists of a finite number of terms. If $\lambda = 0$ is not an eigenvalue, sum (111) consists of an infinite number of terms (axiom 3), and we have for any element of $H$:

$$x = \sum_{k=1}^{\infty} (x, x_k) x_k. \qquad (115)$$
Note. As in the case of integral equations [IV; 29], we have 
the following result for a self-conjugate completely continuous opera- 
tor A. If A is not an eigenvalue or zero, the non-homogeneous equation 


Az=dr+y (116) 


has a unique solution with any given y, defined by 
= re PK(Ys Tk) 


If $\lambda$ is an eigenvalue and the condition for the equation to be
soluble is fulfilled, i.e. $y$ is orthogonal to all the corresponding
eigenelements, the general solution of (116) is given by (117), in which
all the factors for the $x_k$ in which the denominator vanishes have
to be replaced by arbitrary constants.

Suppose that there are both positive and negative eigenvalues. We
enumerate the former, denoted by $\lambda_k^+$, in order of non-increasing
absolute value, and similarly for the latter, denoted by $\lambda_k^-$, and
let $x_k^+$ and $x_k^-$ denote the corresponding eigenelements. We have, in
view of expansion (111):

$$ (Ax, x) = \sum_k \lambda_k^+ \, |(x, x_k^+)|^2 + \sum_k \lambda_k^- \, |(x, x_k^-)|^2, \tag{118} $$

whence there follows at once a new statement of the extremal properties
of the $\lambda_k$ and $x_k$, mention of which was made above [cf. IV; 26].

THEOREM 3. The eigenvalue $\lambda_1^+$ is the greatest value of $(Ax, x)$ when
$\|x\| = 1$, and it is attained when $x = x_1^+$, whilst the eigenvalue $\lambda_n^+$
($n > 1$) is the greatest value of $(Ax, x)$ on condition that

$$ \|x\| = 1 \quad\text{and}\quad (x, x_1^+) = (x, x_2^+) = \dots = (x, x_{n-1}^+) = 0, $$

and it is attained when $x = x_n^+$.

Similarly, $\lambda_1^-$ is the least value of $(Ax, x)$ when $\|x\| = 1$ and is
attained when $x = x_1^-$, whilst $\lambda_n^-$ ($n > 1$) is the least value of $(Ax, x)$
on condition that

$$ \|x\| = 1 \quad\text{and}\quad (x, x_1^-) = (x, x_2^-) = \dots = (x, x_{n-1}^-) = 0, $$

and is attained when $x = x_n^-$.
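
The extremal property of Theorem 3 can be checked numerically in the finite-dimensional case. The following is a minimal sketch, assuming a hypothetical 2 x 2 real symmetric matrix with eigenvalues 3 and -1 in place of the operator A; the names `A`, `quadratic_form`, `greatest` and `least` are illustrative and do not come from the text.

```python
import math

# Hypothetical 2x2 real symmetric matrix standing in for the operator A.
# Its eigenvalues are 3 (positive) and -1 (negative).
A = [[1.0, 2.0],
     [2.0, 1.0]]

def quadratic_form(A, x):
    """Return (Ax, x) for a real matrix A and a real vector x."""
    n = len(x)
    Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    return sum(Ax[i] * x[i] for i in range(n))

# Unit vectors in the plane are x(t) = (cos t, sin t); scan t over [0, pi).
steps = 3600
values = [
    quadratic_form(A, [math.cos(math.pi * k / steps),
                       math.sin(math.pi * k / steps)])
    for k in range(steps)
]

greatest = max(values)  # approximates the greatest positive eigenvalue
least = min(values)     # approximates the least (negative) eigenvalue
```

Under these assumptions `greatest` and `least` approximate the two eigenvalues, in agreement with the statement that the extreme values of (Ax, x) on the unit sphere are eigenvalues of A.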
Let us now prove Courant’s theorem [IV; 187] for space H. 
THEOREM 4. Let $z_1, z_2, \dots, z_{n-1}$ be any fixed elements of $H$ and
$m(z_1, z_2, \dots, z_{n-1})$ be the strict upper bound of the values of $(Ax, x)$ on
condition that

$$ \|x\| = 1 \quad\text{and}\quad (x, z_1) = (x, z_2) = \dots = (x, z_{n-1}) = 0. \tag{119} $$

Now, $\lambda_n^+$ is the least of the numbers $m(z_1, z_2, \dots, z_{n-1})$ for all possible
choices of elements $z_1, z_2, \dots, z_{n-1}$.

The proof is similar to that of [IV; 187]. We have
$m(x_1^+, x_2^+, \dots, x_{n-1}^+) = \lambda_n^+$ and it remains for us
to show that, given any choice of $z_k$ ($k = 1, 2, \dots, n - 1$),

$$ m(z_1, z_2, \dots, z_{n-1}) \ge \lambda_n^+. \tag{120} $$


We shall seek the element $x$ subject to conditions (119) in the
form

$$ x = \sum_{k=1}^{n} c_k x_k^+. \tag{121} $$

Conditions (119) may be written as the following equations for the $c_k$:

$$ \sum_{k=1}^{n} c_k \, (x_k^+, z_s) = 0 \quad (s = 1, 2, \dots, n - 1), \tag{122} $$

$$ \sum_{k=1}^{n} |c_k|^2 = 1. \tag{123} $$

The homogeneous system (122) of $(n - 1)$ equations with $n$
unknowns $c_k$ has non-zero solutions. By multiplying such a solution
by a constant factor, we can also satisfy condition (123). We have thus
found the element of form (121) satisfying conditions (119). We have
for this element:

$$ (Ax, x) = \Big( \sum_{k=1}^{n} \lambda_k^+ c_k x_k^+,\; \sum_{k=1}^{n} c_k x_k^+ \Big) = \sum_{k=1}^{n} \lambda_k^+ \, |c_k|^2. $$

On observing that $\lambda_1^+ \ge \lambda_2^+ \ge \dots \ge \lambda_n^+$, and using (123), we get
$(Ax, x) \ge \lambda_n^+$, whilst $x$ satisfies conditions (119). All the more,
$m(z_1, z_2, \dots, z_{n-1})$, which is equal to the strict upper bound of $(Ax, x)$
with conditions (119), is not less than $\lambda_n^+$. Hence we have inequality
(120) and the theorem is proved. The theorem for $\lambda_n^-$ is similar.

Note. It may easily be shown that the strict upper bound
$m(z_1, z_2, \dots, z_{n-1})$ of $(Ax, x)$ is attained on an element $x_0$ satisfying
conditions (119).

For, there exists by hypothesis a sequence of elements $y_n$, satisfying
conditions (119), such that $(Ay_n, y_n) \to m(z_1, z_2, \dots, z_{n-1})$. Since
$\|y_n\| = 1$, we can assume that the $y_n$ are weakly convergent to some
element $x_0$, where the weak convergence implies [132]:

$$ \|x_0\| \le 1 \quad\text{and}\quad (x_0, z_1) = (x_0, z_2) = \dots = (x_0, z_{n-1}) = 0. $$

In view of the complete continuity of $A$, we have
$(Ax_0, x_0) = m(z_1, z_2, \dots, z_{n-1})$. It remains to show that $\|x_0\| = 1$. It follows
from $m(z_1, z_2, \dots, z_{n-1}) \ge \lambda_n^+$ that $m(z_1, z_2, \dots, z_{n-1}) > 0$ and $\|x_0\| > 0$.
If $\|x_0\| < 1$, by introducing the normalized element $y_0 = x_0 / \|x_0\|$,
satisfying conditions (119), we should have obtained

$$ (Ay_0, y_0) = \frac{1}{\|x_0\|^2}\, (Ax_0, x_0) = \frac{m(z_1, z_2, \dots, z_{n-1})}{\|x_0\|^2} > m(z_1, z_2, \dots, z_{n-1}). $$

But, by the definition of $m(z_1, z_2, \dots, z_{n-1})$, we have
$(Ay_0, y_0) \le m(z_1, z_2, \dots, z_{n-1})$. The contradiction obtained shows that
$\|x_0\| = 1$, and our assertion that $(Ax, x)$ attains its strict upper bound
$m(z_1, z_2, \dots, z_{n-1})$ is proved. Using Theorem 2 of [132], we can say
that $y_n \Rightarrow x_0$.

The theorem is applied when comparing the eigenvalues of different 
operators [cf. IV; 188]. 
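
Courant's theorem can be illustrated in three dimensions. The sketch below is a hedged illustration only: it assumes the hypothetical operator A = diag(3, 2, 1) and takes n = 2, so that a single constraint element z is imposed; the helper names `form`, `cross`, `orthonormal_complement` and `m` are invented for this example. The bound m(z) is approximated by scanning unit vectors in the plane orthogonal to z.

```python
import math

# Hypothetical operator A = diag(3, 2, 1); its eigenvalues are 3 >= 2 >= 1.
EIGENVALUES = [3.0, 2.0, 1.0]

def form(x):
    """Quadratic form (Ax, x) for the diagonal operator above."""
    return sum(lam * c * c for lam, c in zip(EIGENVALUES, x))

def cross(a, b):
    """Vector product in R^3."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def normalize(a):
    length = math.sqrt(sum(c * c for c in a))
    return [c / length for c in a]

def orthonormal_complement(z):
    """Two orthonormal vectors spanning the plane orthogonal to a non-zero z."""
    i = min(range(3), key=lambda j: abs(z[j]))  # axis least aligned with z
    e = [0.0, 0.0, 0.0]
    e[i] = 1.0
    u = normalize(cross(z, e))
    v = normalize(cross(z, u))
    return u, v

def m(z, steps=2000):
    """Approximate the upper bound of (Ax, x) over unit x orthogonal to z."""
    u, v = orthonormal_complement(z)
    best = float("-inf")
    for k in range(steps):
        t = math.pi * k / steps
        x = [math.cos(t) * u[i] + math.sin(t) * v[i] for i in range(3)]
        best = max(best, form(x))
    return best

m_first_eigenvector = m([1.0, 0.0, 0.0])  # constraint on the top eigenelement
m_other_choices = [m([1.0, 1.0, 0.0]), m([0.0, 0.0, 1.0]), m([1.0, 2.0, 3.0])]
```

With these assumptions the choice z = (1, 0, 0) gives the second eigenvalue, while every other sampled choice of z gives a bound that is not smaller, as inequality (120) requires.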

Notice also a direct consequence of (118) [cf. IV; 26]. The necessary
and sufficient condition for a completely continuous self-conjugate
operator $A$ to be positive ($(Ax, x) \ge 0$ for $x \in H$) is that it has no
negative eigenvalues.

We shall now prove that a completely continuous self-conjugate
operator is fully defined by the nature of its spectrum, which is
described in Theorems 1 and 2.

THEOREM 5. Let a linear self-conjugate operator $A$ have the following
properties: the orthonormal system of its eigenelements $x_k$ ($k = 1, 2, \dots$)
is complete, all the non-zero eigenvalues $\lambda_k$ have finite rank, and only
a finite number of eigenvalues can lie outside any interval $[-\varepsilon, +\varepsilon]$,
where $\varepsilon > 0$. The operator $A$ is now completely continuous.

By hypothesis, we can arrange the $\lambda_k$ in order of non-increasing
absolute value:

$$ |\lambda_1| \ge |\lambda_2| \ge |\lambda_3| \ge \dots \tag{124} $$

and $\lambda_n \to 0$ as $n \to \infty$. Remember that, if an eigenvalue has rank $r$,
it figures $r$ times in sequence (124) (the eigenvalue $\lambda = 0$ can have
infinite rank).

Since the system of $x_k$ is complete, we have the Fourier expansion
for any element $x \in H$:

$$ x = \sum_{k=1}^{\infty} a_k x_k, \tag{125} $$

and

$$ Ax = \sum_{k=1}^{\infty} a_k \lambda_k x_k. \tag{126} $$

Let $U$ be a bounded set of elements $x$, i.e. there exists a positive
number $l$ such that

$$ \sum_{k=1}^{\infty} |a_k|^2 \le l^2, \tag{127} $$

if $x \in U$. We have to show that the set $AU$ is compact. It is bounded
by virtue of $\|Ax\| \le |\lambda_1|\, l$. It remains to show [92] that, if (127) is
satisfied, given any $\varepsilon > 0$, there exists a positive $n_\varepsilon$ such that

$$ \sum_{k=n_\varepsilon}^{\infty} |a_k|^2 \lambda_k^2 \le \varepsilon^2. $$

Since $\lambda_n \to 0$ as $n \to \infty$, there exists an $n_\varepsilon$ such that $|\lambda_n| \le \varepsilon / l$
for $n \ge n_\varepsilon$. Now,

$$ \sum_{k=n_\varepsilon}^{\infty} |a_k|^2 \lambda_k^2 \le \frac{\varepsilon^2}{l^2} \sum_{k=n_\varepsilon}^{\infty} |a_k|^2 \le \frac{\varepsilon^2}{l^2}\, l^2 = \varepsilon^2, $$

and the theorem is proved.


137. Unitary operators. Along with self-conjugate operators, we
must consider a further class of linear operators.

DEFINITION. A linear operator

$$ y = Ux \tag{128} $$

is said to be unitary if it does not change the norm of an element, i.e.
$\|Ux\| = \|x\|$, and transforms $H$ into the whole of $H$, i.e. given any
$y \in H$, there exists a pre-image $x$, i.e. an element $x$ such that (128) holds.

Notice that, by the definition, the norm of a unitary operator is
equal to unity. The basic properties of unitary operators are given
in the next theorem.

THEOREM 1. A unitary operator transforms $H$ one-to-one into itself,
has a bounded inverse, defined by

$$ U^{-1} = U^*, \tag{129} $$

$$ UU^* = U^*U = E, \tag{130} $$

where $U^{-1}$ is also a unitary operator, and does not change a scalar
product. Condition (130) is sufficient for $U$ to be unitary.

If $x_1$ and $x_2$ are two elements of $H$, by definition of unitary operator,
we have $\|Ux_1 - Ux_2\| = \|U(x_1 - x_2)\| = \|x_1 - x_2\|$, so that, if
$Ux_1 = Ux_2$, then $x_1 = x_2$, i.e. given different elements $x$, (128) gives
different $y$, i.e. $U$ defines a one-to-one transformation of $H$ into
itself. Hence there exists a bounded inverse $U^{-1}$, defined throughout
$H$, where, since $U$ does not change a norm, we have $\|U^{-1}y\| = \|y\|$,
i.e. $U^{-1}$ is also a unitary operator. Since the norm is invariable,
we can write $(Ux, Ux) = (x, x)$, whence it follows immediately
that

$$ (U^*Ux, x) = (x, x). $$

But if two quadratic functionals are equal, the operators appearing
in them must be equal, i.e. $U^*U = E$, whence it follows that $U^*$
is the left-hand inverse of $U$, and, since a bounded inverse operator
exists for $U$, we also have $UU^* = E$, and (129) is proved. The
assertion that $U$ does not change the scalar product follows at once
from

$$ (Ux, Uy) = (U^*Ux, y) = (x, y). \tag{131} $$

Finally, let us show that (130) implies that $U$ is unitary. By (130),
$U$ has a bounded inverse, defined by (129). It remains to show that
$U$ does not change a norm. By (130), this follows from (131) with
$y = x$.
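
Conditions (130) and (131) are easy to verify for a concrete matrix. The following sketch assumes a hypothetical 2 x 2 complex matrix `U` built from two arbitrary angles, and a scalar product that is linear in its first argument (an assumed convention); the helper names are illustrative only.

```python
import cmath
import math

def adjoint(U):
    """Conjugate transpose of a square matrix given as a list of rows."""
    n = len(U)
    return [[U[j][i].conjugate() for j in range(n)] for i in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(U, x):
    return [sum(U[i][j] * x[j] for j in range(len(x))) for i in range(len(U))]

def inner(x, y):
    """Scalar product, linear in the first argument (assumed convention)."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

theta, phi = 0.7, 0.3  # arbitrary illustrative angles
U = [[cmath.exp(1j * phi) * math.cos(theta), -math.sin(theta)],
     [cmath.exp(1j * phi) * math.sin(theta), math.cos(theta)]]

left = matmul(adjoint(U), U)    # should be close to the identity E
right = matmul(U, adjoint(U))   # should be close to the identity E

x = [1 + 2j, -1j]
y = [0.5, 2 - 1j]
before = inner(x, y)
after = inner(apply(U, x), apply(U, y))  # scalar product of Ux and Uy
```

Here both products reproduce the identity up to rounding, and `after` agrees with `before`, which is the content of (131) for this particular matrix.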

Notice also that, if $U_1$ and $U_2$ are two unitary operators, their product
$U_1 U_2$ is also a unitary operator. This is an immediate consequence
of the fact that, if $U_1$ and $U_2$ transform $H$ one-to-one into $H$ and
do not change the norm, their product obviously has the same
properties. The inverse of a unitary operator is therefore unitary, and
the product of unitary operators is unitary, i.e. unitary operators form
a group.

Let

$$ x_1, x_2, x_3, \dots \tag{132} $$

be a closed orthonormal system. On applying the unitary
transformation $U$ to it, we obtain, in view of the properties of $U$, the
orthonormal system

$$ y_1 = Ux_1; \quad y_2 = Ux_2; \quad y_3 = Ux_3; \quad \dots \tag{133} $$

An element $x$ has the expansion in elements of system (132):

$$ x = a_1 x_1 + a_2 x_2 + a_3 x_3 + \dots, \tag{134} $$

so that the transformed element $Ux$ has an expansion in elements
of system (133) with the same coefficients:

$$ Ux = a_1 y_1 + a_2 y_2 + a_3 y_3 + \dots \tag{135} $$

The element $Ux$ may be any element of $H$, so that system (133)
is also closed. Conversely, if, given two closed orthonormal systems
$x_k$ and $y_k$ ($k = 1, 2, \dots$), we define an operator $U$ for any element $x$
having form (134) by (135), this operator transforms $H$ one-to-one
into $H$ without changing the norm:

$$ \|Ux\|^2 = \|x\|^2 = \sum_{k=1}^{\infty} |a_k|^2, $$

i.e. $U$ is unitary. Thus, every unitary operator can be defined with
the aid of a transformation of elements of one closed orthonormal
system into the elements of another such system.

Let $A$ be a linear operator and $y = Ax$. Let $U$ be a unitary operator,
and $y' = Uy$, $x' = Ux$. Since $y = Ax$, we can express $y'$ in terms
of $x'$ in accordance with

$$ y' = (UAU^{-1})\, x', \tag{136} $$

the operator $B = UAU^{-1}$ being called the unitary equivalent of $A$.
It follows from this formula that $A = U^{-1}BU$, whence it is clear
that, if $B$ is the unitary equivalent of $A$, $A$ is the unitary equivalent
of $B$. If $P$ is the projector into subspace $L_P$, $UPU^{-1}$ is evidently
the projector into the subspace obtained by applying $U$ to subspace
$L_P$. If $x_0$ is an eigenelement of $A$ corresponding to the eigenvalue
$\lambda_0$, i.e. $Ax_0 = \lambda_0 x_0$, we obviously have, on writing $x_0' = Ux_0$:
$(UAU^{-1})\, x_0' = \lambda_0 x_0'$, i.e. unitary equivalents have the same
eigenvalues, whilst the eigenelements are connected by the unitary
transformation concerned. It can easily be shown, by using (129), that if
$A$ is a self-conjugate operator, $B$ is also self-conjugate.

THEOREM 2. The eigenvalues of a unitary operator have unit modulus,
whilst the eigenelements corresponding to different values are mutually
orthogonal.

Let $U$ be a unitary operator and $x_0$ an eigenelement of it,
corresponding to the eigenvalue $\lambda_0$, i.e. $Ux_0 = \lambda_0 x_0$. Since $U$ does not change
the norm, we can write

$$ (x_0, x_0) = (Ux_0, Ux_0) = (\lambda_0 x_0, \lambda_0 x_0) = |\lambda_0|^2 (x_0, x_0), $$

i.e. $\|x_0\| = |\lambda_0| \, \|x_0\|$, whence it follows, since $\|x_0\| \ne 0$, that
$|\lambda_0| = 1$. Let $x_0$ and $x_1$ be eigenelements corresponding to distinct
eigenvalues $\lambda_0$ and $\lambda_1$, i.e. $Ux_0 = \lambda_0 x_0$ and $Ux_1 = \lambda_1 x_1$. Since $U$
does not change the scalar product, we can write

$$ (x_0, x_1) = (Ux_0, Ux_1) = (\lambda_0 x_0, \lambda_1 x_1) = \lambda_0 \bar{\lambda}_1 (x_0, x_1). $$

Suppose $(x_0, x_1) \ne 0$; then it follows from this last equation that
$\lambda_0 \bar{\lambda}_1 = 1$. But, by what has been proved, $|\lambda_1| = 1$, so that
$\lambda_0 = 1/\bar{\lambda}_1 = \lambda_1$, i.e. $\lambda_0 = \lambda_1$, which is absurd, since $\lambda_0$ and $\lambda_1$ are distinct
by hypothesis.

Let us now introduce a further class of operators.

DEFINITION. A linear operator $V$ is said to be isometric, if it does
not change the norm of an element, i.e. $\|Vx\| = \|x\|$ for $x \in H$.

Like every linear operator, $V$ is defined in the whole of $H$, but
it is not required that $V$ transform $H$ into the whole of $H$; in fact
an isometric operator need not be unitary. To give an example, let
$x_k$ ($k = 1, 2, \dots$) be a closed orthonormal system in $H$ as above,
so that every element $x$ is expressible by its Fourier series (134).
We define $V$ by the formula

$$ Vx = \sum_{k=1}^{\infty} a_k x_{k+1}. \tag{137} $$

Obviously, $V$ is a linear operator, and

$$ \|Vx\|^2 = \|x\|^2 = \sum_{k=1}^{\infty} |a_k|^2. $$

It follows from (137) that $V$ maps $H$ one-to-one onto the subspace
of elements orthogonal to $x_1$.
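
The shift operator (137) can be modelled on elements with finitely many non-zero Fourier coefficients. In the sketch below an element is represented by the Python list of its coefficients; this representation and the names `shift`, `shift_adjoint` and `norm` are assumptions made for the illustration, not notation from the text.

```python
import math

def shift(a):
    """Coefficients of Vx: the k-th coefficient of x moves to position k + 1."""
    return [0.0] + list(a)

def shift_adjoint(a):
    """Coefficients of V*x: drop the first coefficient, move the rest down."""
    return list(a[1:])

def norm(a):
    return math.sqrt(sum(abs(c) ** 2 for c in a))

a = [3.0, 4.0]                          # x = 3 x_1 + 4 x_2, so ||x|| = 5
norm_after_shift = norm(shift(a))       # V does not change the norm
round_trip = shift_adjoint(shift(a))    # V*V x recovers x
other_order = shift(shift_adjoint(a))   # VV* x loses the x_1 component
```

The product V*V acts as the identity, whereas VV* annihilates the component along the first element of the system, so that condition (130) fails and V is isometric without being unitary.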


138. The absolute norm of an operator. We now introduce a new concept
in regard to the norm of a linear operator. Let $A$ be a linear operator, and
$x_k$, $y_k$ ($k = 1, 2, \dots$) any two given closed orthonormal systems in $H$. We form
the sum of non-negative terms:

$$ \sum_{p,q=1}^{\infty} (Ax_p, y_q)(y_q, Ax_p) = \sum_{p,q=1}^{\infty} |(Ax_p, y_q)|^2. \tag{138} $$

Let $N(A; x_p, y_q)$ denote the positive value of the square root of this sum.
It may be equal to $(+\infty)$. Let us show that it is independent of the choice of
orthogonal systems $x_p$ and $y_q$. Since $(Ax_p, y_q)$ are the Fourier coefficients of
the element $Ax_p$ with respect to system $y_q$, we can write instead of (138), by
virtue of the closure equation:

$$ N^2(A; x_p, y_q) = \sum_{p=1}^{\infty} \|Ax_p\|^2. \tag{139} $$

On the other hand, since $(Ax_p, y_q) = (x_p, A^* y_q)$, we obtain

$$ N^2(A; x_p, y_q) = N^2(A^*; y_q, x_p) = \sum_{q=1}^{\infty} \|A^* y_q\|^2. \tag{140} $$

It follows from (139) that $N^2(A; x_p, y_q)$ does not depend on the choice of
system $y_q$, and it follows from (140) that it does not depend on the choice of
system $x_p$; thus it is natural for us to write $N(A)$ instead of $N(A; x_p, y_q)$.
The positive number $N(A)$ will be termed the absolute norm of the operator
$A$. This norm may be equal to $(+\infty)$. On taking (139) and (140) into account,
and the independence of $N(A)$ of the choice of systems $x_p$ and $y_q$, we get

$$ N(A) = N(A^*). \tag{141} $$

Further, the equation

$$ N^2(A + B) = \sum_{p=1}^{\infty} \|Ax_p + Bx_p\|^2 $$

and inequality (107) of [59] yield at once:

$$ N(A + B) \le N(A) + N(B). \tag{142} $$

Let $U$ be a unitary operator. Now, $U^{-1}x_p$ forms a closed orthonormal
system and $\|UAx\| = \|Ax\|$. Hence it follows, by (139), that

$$ N(UAU^{-1}) = N(A), \tag{143} $$

i.e. unitary equivalents have the same absolute norm. Let $N(A)$ be finite and
let $x$ be a given normalized element. We can take it as the first element in an
orthonormal system, in which case we have from (139): $N^2(A) \ge \|Ax\|^2$, i.e.

$$ \|Ax\| \le N(A) \quad\text{for}\quad \|x\| = 1, $$

whence it follows that the ordinary norm of an operator does not exceed its
absolute norm.

THEOREM. If the absolute norm of an operator $A$ is finite, $A$ is completely
continuous; if, in addition, $A$ is self-conjugate, we have

$$ N^2(A) = \sum_{k} \lambda_k^2, \tag{144} $$

where $\lambda_k$ are the eigenvalues of $A$ (multiple values appear several times).

Let $U$ be any given bounded set. We have to show that, if $N(A) < +\infty$,
the set $Ax$, where $x \in U$, is compact. By hypothesis, there exists a positive
number $l$ such that $\|x\| \le l$ if $x \in U$. The set $AU$ is evidently bounded, since
$\|Ax\| \le n_A l$. On introducing some closed orthonormal system $y_k$ ($k = 1, 2, \dots$),
we can transform $H$ into $l_2$, and it remains for us to show that, given any $\varepsilon > 0$,
there exists a positive integer $n_\varepsilon$ such that

$$ \sum_{k=n_\varepsilon}^{\infty} |(Ax, y_k)|^2 \le \varepsilon^2. \tag{145} $$

We have

$$ \sum_{k=n_\varepsilon}^{\infty} |(Ax, y_k)|^2 = \sum_{k=n_\varepsilon}^{\infty} |(x, A^* y_k)|^2 \le l^2 \sum_{k=n_\varepsilon}^{\infty} \|A^* y_k\|^2. $$

But, since $N(A) < +\infty$, series (140) is convergent and there exists an $n_\varepsilon$
(independent of the choice of $x \in U$) such that

$$ \sum_{k=n_\varepsilon}^{\infty} \|A^* y_k\|^2 \le \frac{\varepsilon^2}{l^2}, $$

whence (145) follows, and the complete continuity of $A$ is proved. If $A$ is
self-conjugate, we choose for $y_k$ the closed orthonormal system of its eigenelements,
so that $Ay_k = \lambda_k y_k$ and $\|Ay_k\|^2 = \|A^* y_k\|^2 = \lambda_k^2$. Now, by (140), we obtain
(144), and the fact that $N(A)$ is finite is equivalent to the convergence of the
series on the right-hand side of (144).
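
In the finite-dimensional case formulas (139) and (144) can be checked directly. The following sketch assumes a hypothetical 2 x 2 real symmetric matrix with eigenvalues 3 and -1 and two orthonormal bases of the plane; the names `absolute_norm_sq`, `standard_basis` and `rotated_basis` are illustrative and not taken from the text.

```python
import math

# Hypothetical self-conjugate operator with eigenvalues 3 and -1.
A = [[1.0, 2.0],
     [2.0, 1.0]]

def apply(A, x):
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]

def norm_sq(x):
    return sum(c * c for c in x)

def absolute_norm_sq(A, basis):
    """Sum of ||A x_p||^2 over an orthonormal basis, as in formula (139)."""
    return sum(norm_sq(apply(A, x)) for x in basis)

t = 0.4  # arbitrary rotation angle
standard_basis = [[1.0, 0.0], [0.0, 1.0]]
rotated_basis = [[math.cos(t), math.sin(t)], [-math.sin(t), math.cos(t)]]

n_standard = absolute_norm_sq(A, standard_basis)
n_rotated = absolute_norm_sq(A, rotated_basis)
sum_of_squared_eigenvalues = 3.0 ** 2 + (-1.0) ** 2
ordinary_norm = 3.0  # greatest absolute value of an eigenvalue
```

Both bases give the same value, equal to the sum of the squared eigenvalues, and the ordinary norm 3 does not exceed the absolute norm, which here is the square root of 10.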

We shall prove later that, if $A$ is a self-conjugate positive operator, i.e.
$(Ax, x) \ge 0$ for $x \in H$, there exists a linear positive operator $B$ such that
$B^2 = A$. We usually write $B = \sqrt{A}$. We can use this operator to introduce
the concept of the trace of a linear positive completely continuous self-conjugate
operator:

$$ N^2(B) = \sum_{p=1}^{\infty} \|Bx_p\|^2 = \sum_{p=1}^{\infty} (Bx_p, Bx_p) = \sum_{p=1}^{\infty} (B^2 x_p, x_p) = \sum_{p=1}^{\infty} (Ax_p, x_p). $$

It will be seen from this that, in the case of a self-conjugate positive operator,
the sum

$$ \sum_{p=1}^{\infty} (Ax_p, x_p) $$

is independent of the choice of system $x_p$. This sum is called the trace of operator
$A$ and is written symbolically as $\mathrm{Sp}(A)$. It follows from the above discussion
that

$$ \mathrm{Sp}(A) = N^2(\sqrt{A}) = \sum_{p=1}^{\infty} (Ax_p, x_p). $$

If $A$ has a purely point spectrum and we take as $x_p$ the closed orthonormal
system of the eigenelements of $A$, we obtain, since $Ax_p = \mu_p x_p$:

$$ \mathrm{Sp}(A) = \sum_{p=1}^{\infty} \mu_p. $$


139. Operations on subspaces. This section and the next will be
devoted to operations on subspaces and the properties of projection
operators. This material will be required later for the theory of
self-conjugate operators.

Let $L_k$ ($k = 1, 2, \dots, m$) be mutually orthogonal subspaces. We
bring in the concept of their sum [cf. 122]:

$$ L = L_1 \oplus L_2 \oplus \dots \oplus L_m. \tag{146} $$

We shall write $L$ for the set of elements $x$ of the form

$$ x = x_1 + x_2 + \dots + x_m, \tag{147} $$

where $x_k \in L_k$. Since the $L_k$ are orthogonal, we have $x_k = P_{L_k} x$
($k = 1, 2, \dots, m$), and

$$ \|x\|^2 = \|x_1\|^2 + \|x_2\|^2 + \dots + \|x_m\|^2. $$

It may easily be shown that $L$ is a subspace. It is known as the
orthogonal sum of subspaces $L_k$. Let us now consider an infinite sum
of mutually orthogonal subspaces:


$$ L = L_1 \oplus L_2 \oplus L_3 \oplus \dots \tag{148} $$

Let $L$ denote the set of elements $x$ expressible as the sum of a
convergent series

$$ x = x_1 + x_2 + x_3 + \dots, \tag{149} $$

where $x_k \in L_k$. The last equation is equivalent to [122]:

$$ \|x\|^2 = \|x_1\|^2 + \|x_2\|^2 + \|x_3\|^2 + \dots \tag{150} $$

and when it is satisfied, $x_k = P_{L_k} x$. If $x$ is any element of $H$,
$x_k = P_{L_k} x$ and $s_m(x) = x_1 + x_2 + \dots + x_m$, then

$$ \|x - s_m(x)\|^2 = \|x\|^2 - \sum_{k=1}^{m} \|x_k\|^2. \tag{151} $$

The equation (150) is equivalent to $\|x - s_m(x)\| \to 0$ as $m \to \infty$.
It may easily be seen that $L$ is a lineal. Let us show that $L$ is a
subspace. Let $x^{(n)} \in L$ and $x^{(n)} \Rightarrow x$ as $n \to \infty$. We have to show that
$x \in L$ also. We have the obvious inequality

$$ \|x - s_m(x)\| \le \|x - x^{(n)}\| + \|x^{(n)} - s_m(x^{(n)})\| + \|s_m(x^{(n)} - x)\|. $$

But $s_m(x^{(n)} - x)$ is the projection of $x^{(n)} - x$ on to the subspace
$L_1 \oplus L_2 \oplus \dots \oplus L_m$, so that $\|s_m(x^{(n)} - x)\| \le \|x^{(n)} - x\|$, and we
can write

$$ \|x - s_m(x)\| \le 2\, \|x - x^{(n)}\| + \|x^{(n)} - s_m(x^{(n)})\|. \tag{152} $$

Given $\varepsilon > 0$, we can fix an $n$ such that $\|x - x^{(n)}\| \le \varepsilon/3$. But
$\|x^{(n)} - s_m(x^{(n)})\| \le \varepsilon/3$ for all sufficiently large $m$, since $x^{(n)} \in L$,
and it follows from (152) that $\|x - s_m(x)\| \le \varepsilon$, i.e. $x \in L$, and we
have proved that $L$ is a subspace. Given any $y \in H$, the element $P_L y$
is expressible as

$$ P_L y = z_1 + z_2 + \dots, \tag{153} $$

where $z_k = P_{L_k}(P_L y)$. But, since the $L_k$ belong to $L$, we have
$P_L(P_{L_k} y) = P_{L_k} y$, i.e. $P_L P_{L_k} = P_{L_k}$, and on passing to conjugate operators,
$P_{L_k} P_L = P_{L_k}$, whence $z_k = P_{L_k} y$, and (153) can be rewritten as

$$ P_L y = P_{L_1} y + P_{L_2} y + P_{L_3} y + \dots, \tag{154} $$

i.e.

$$ P_L = P_{L_1} + P_{L_2} + P_{L_3} + \dots, \tag{155} $$

where the convergence of the series must be understood in the sense
of the strong convergence of a sequence of operators.

Notice that, if $x_1, x_2, \dots$ are mutually orthogonal and normalized
elements, and it is assumed that each $x_k$ generates a one-dimensional
subspace $L_k$ of elements $a x_k$, where $a$ is any complex number, we
have an orthogonal sum of these subspaces, which is formed by
elements of the form

$$ \sum_{k} c_k x_k, $$

where the series of numbers $|c_k|^2$ is convergent, and the projector
into subspace $L$ has the form

$$ P_L y = \sum_{k} a_k x_k, $$

where $a_k x_k = P_{L_k} y$ and $a_k = (y, x_k)$.

A subspace $M$ is said to be part of subspace $L$ ($M \subset L$) if all the
elements of $M$ belong to $L$. The difference between $L$ and $M$, written $L \ominus M$,
is defined as the set of elements of $L$ orthogonal to $M$ [122]. If we
write $L \ominus M = M_1$, then $L = M \oplus M_1$, and subspaces $M$ and $M_1$
are complementary to each other with respect to $L$ [122].

The product $L_1 L_2$ of two subspaces is the set of elements common
to $L_1$ and $L_2$. It is easily shown that this set is a subspace. This
definition of product is applicable to any finite or infinite number of
subspaces.


140. Projection operators. We have seen that the projection
operator $P_L$ into a subspace $L$ is self-conjugate and has unit norm
(excluding the case when $P_L$ is the annihilation operator [124]).
It follows at once from the definition that

$$ P_L^2 = P_L, \tag{156} $$

so that

$$ (P_L x, x) = (P_L^2 x, x) = (P_L x, P_L x) = \|P_L x\|^2 \ge 0, $$

i.e. $P_L$ is a positive operator. Let us prove some theorems on
projectors.

THEOREM 1. If $A$ is a self-conjugate operator, satisfying

$$ A^2 = A, \tag{157} $$

$A$ is the projector $P_L$ into the subspace $L$ formed by the elements $y = Ax$
when $x$ runs over all $H$.

The set $L$ of elements $y = Ax$, when $x$ runs over $H$, is a lineal,
since the operator $A$ is distributive. Let us show that $L$ is a subspace.
Let $y_n$ be a sequence of elements of $L$ and $y_n \Rightarrow y$. We have to show
that $y \in L$. Since $y_n \in L$, we can say that there exist elements $x_n$
such that $y_n = Ax_n$, or, by (157), $y_n = A(Ax_n)$, i.e. $y_n = Ay_n$, whence,
by passing to the limit and using the continuity of operator $A$, we
get $y = Ay$, so that $y \in L$. It remains for us to show, in order to
complete the proof, that the element $(x - Ax)$ is orthogonal to any
element of $L$, i.e. is orthogonal to an element $Az$, where $z$ is any
element of $H$. We have

$$ (x - Ax, Az) = (x, Az) - (Ax, Az). $$

Since $A$ is self-conjugate, we can change $A$ over from the first
element $x$ to the second element $Az$. We thus get

$$ (x - Ax, Az) = (x, Az) - (x, A^2 z) $$

and it follows from (157) that the right-hand side is zero, i.e.
$(x - Ax, Az) = 0$; the theorem is proved.
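
Theorem 1 can be illustrated with a symmetric idempotent matrix. The sketch below assumes a hypothetical vector v = (1, 2, 2) in three-dimensional real space and forms the matrix with entries v_i v_j / (v, v); all names are illustrative and not taken from the text.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(A, x):
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

v = [1.0, 2.0, 2.0]                       # hypothetical generating vector
vv = dot(v, v)                            # equals 9
P = [[v[i] * v[j] / vv for j in range(3)] for i in range(3)]

P_squared = matmul(P, P)                  # should reproduce P, as in (157)
idempotency_error = max(abs(P_squared[i][j] - P[i][j])
                        for i in range(3) for j in range(3))

x = [3.0, -1.0, 4.0]
z = [0.5, 2.0, -7.0]
Px = apply(P, x)
residual = [x[i] - Px[i] for i in range(3)]     # the element x - Ax
orthogonality = dot(residual, apply(P, z))      # (x - Ax, Az)
```

The matrix is symmetric and satisfies (157) up to rounding, and the element x - Px is orthogonal to Pz, which is the step used at the end of the proof.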
Two projectors $P_L$ and $P_M$ are said to be mutually orthogonal if

$$ P_L P_M = 0, \tag{158} $$

where the symbol 0 on the right indicates the annihilation operator.

On passing to conjugate operators in (158) and recalling that the
projector is self-conjugate, we obtain, along with (158):

$$ P_M P_L = 0. \tag{159} $$

THEOREM 2. The necessary and sufficient condition for projectors $P_L$
and $P_M$ to be mutually orthogonal is that the subspaces $L$ and $M$ be
mutually orthogonal.

Necessity. If $L$ and $M$ were not mutually orthogonal, there would
exist an element $x_0$ of $M$, not orthogonal to $L$. We should have
$P_M x_0 = x_0$ for such an element, so that $P_L(P_M x_0) = P_L x_0 \ne 0$, which
contradicts (158). Let us prove the sufficiency. If $L \perp M$, $P_M x$ is
orthogonal to $L$ for any element $x$, so that $P_L(P_M x) = 0$, i.e. (158)
holds.

THEOREM 3. The necessary and sufficient condition for the sum
$P_L + P_M$ to be a projector is that subspaces $L$ and $M$ be mutually
orthogonal. If this condition is fulfilled, $P_L + P_M$ is the projector into
$L \oplus M$.

Necessity. Let $P_L + P_M$ be a projector. We now have, by (156):

$$ (P_L + P_M)(P_L + P_M) = P_L + P_M \tag{160} $$

and, on removing the brackets and recalling that $P_L^2 = P_L$ and
$P_M^2 = P_M$, we get

$$ P_L P_M + P_M P_L = 0. \tag{161} $$

We multiply on the left by $P_L$:

$$ P_L P_M + P_L P_M P_L = 0. \tag{162} $$

On multiplying this equation on the right by $P_L$, we get $P_L P_M P_L = 0$,
which leads us, by (162), to $P_L P_M = 0$, from which it follows,
by Theorem 2, that $L$ and $M$ are mutually orthogonal. Let us prove
the sufficiency. If $L$ and $M$ are mutually orthogonal, we have (161),
by virtue of (158) and (159), and hence $(P_L + P_M)^2 = P_L + P_M$,
and by Theorem 1, $P_L + P_M$ is a projector. The subspace corresponding
to this projector is defined by

$$ y = (P_L + P_M)\, x = P_L x + P_M x, \tag{163} $$

where $x$ runs over $H$. Here, $P_L x \in L$ and $P_M x \in M$. Hence any
element $y$, defined by (163), belongs to $L \oplus M$. Conversely, if we
take any element $u + v$ belonging to $L \oplus M$, where $u \in L$ and $v \in M$,
substitution of $x = u + v$ in (163) gives us $y = u + v$. Thus (163)
defines the subspace $L \oplus M$, and the theorem is proved.
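
A two-dimensional example shows both halves of Theorem 3. The following sketch assumes three hypothetical projectors in the plane: `P_L` onto the first coordinate axis, `P_M` onto the second axis (orthogonal to the first), and `P_K` onto the diagonal line, which is not orthogonal to the first axis. The helper names are illustrative.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    n = len(A)
    return [[A[i][j] + B[i][j] for j in range(n)] for i in range(n)]

def is_idempotent(A, tol=1e-12):
    """True if A*A equals A up to the tolerance, i.e. condition (157) holds."""
    A2 = matmul(A, A)
    n = len(A)
    return all(abs(A2[i][j] - A[i][j]) <= tol
               for i in range(n) for j in range(n))

P_L = [[1.0, 0.0], [0.0, 0.0]]   # projector onto the line spanned by (1, 0)
P_M = [[0.0, 0.0], [0.0, 1.0]]   # projector onto the line spanned by (0, 1)
P_K = [[0.5, 0.5], [0.5, 0.5]]   # projector onto the line spanned by (1, 1)

orthogonal_product = matmul(P_L, P_M)              # zero operator, as in (158)
orthogonal_sum_ok = is_idempotent(add(P_L, P_M))   # sum is a projector
oblique_sum_ok = is_idempotent(add(P_L, P_K))      # sum is not a projector
```

For the orthogonal pair the sum is the identity, which is the projector onto the whole plane, whereas for the oblique pair the sum fails condition (157).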

The operator $P_M$ is said to be part of operator $P_L$ if

$$ P_L P_M = P_M. \tag{164} $$

On passing to conjugate operators in this expression, we get

$$ P_M P_L = P_M. \tag{165} $$

THEOREM 4. The necessary and sufficient condition for $P_M$ to be
part of $P_L$ is that the subspace $M$ be part of subspace $L$. This is equivalent
to the condition that, for any $x$,

$$ \|P_M x\| \le \|P_L x\|, \tag{166} $$

or what amounts to the same thing,

$$ P_L \ge P_M. \tag{167} $$

If condition (164) is satisfied and we take an element $x_0$ belonging
to $M$, we have $P_M x_0 = x_0$, and it follows from (164) that $P_L x_0 = x_0$,
i.e. $x_0 \in L$ and $M$ is part of $L$. Conversely, if $M$ is part of $L$,
given any $x$, the element $P_M x$ belongs to $M$, and hence to $L$, so that
$P_L(P_M x) = P_M x$, i.e. condition (164) is fulfilled. Now, by (165), we
can write for any element $x$: $\|P_M x\| = \|P_M(P_L x)\| \le \|P_L x\|$,
whence inequality (166) follows. Let us now show that, conversely,
(166) implies that $M$ is part of $L$. If this were not the case, an element
$x_0$ would exist, belonging to $M$ and not to $L$. We should have for
this element: $\|P_M x_0\| = \|x_0\|$ and $\|P_L x_0\| < \|x_0\|$, which
contradicts (166). Finally, by (157), we can write (166) as
$(P_L x, x) \ge (P_M x, x)$ or $((P_L - P_M)\, x, x) \ge 0$, whence it follows that (167) is
equivalent to (166), and the proof is complete.

THEOREM 5. The necessary and sufficient condition for the difference
$P_L - P_M$ to be a projector is that $M$ be part of $L$. If this condition is
fulfilled, $P_L - P_M$ is the projector into $L \ominus M$.

If $P_L - P_M$ is a projector, we must have

$$ (P_L - P_M)(P_L - P_M) = P_L - P_M \tag{168} $$

or, on removing the brackets,

$$ P_L P_M + P_M P_L = 2 P_M. \tag{169} $$

On multiplying by $P_L$, first from the left, then from the right,
we arrive at the two equations

$$ P_L P_M + P_L P_M P_L = 2 P_L P_M \quad\text{and}\quad P_L P_M P_L + P_M P_L = 2 P_M P_L, $$

from which it follows that $P_L P_M = P_M P_L$, and by (169), we have
$P_L P_M = P_M P_L = P_M$, i.e. condition (164) is fulfilled, and $M$ is
part of $L$. Conversely, if $M$ is part of $L$, i.e. (164) and (165) are
satisfied, (169) follows from them, and hence (168), so that, by Theorem 1,
$P_L - P_M$ is a projector. The subspace corresponding to it is defined by

$$ y = (P_L - P_M)\, x = P_L x - P_M x, \tag{170} $$

where $x$ runs over all $H$. The elements $P_L x$ and $P_M x$ belong to $L$,
since $M$ is part of $L$ by hypothesis. Formula (170) thus yields elements
belonging to $L$. Let us show that, in addition, the elements $y$ are
orthogonal to $M$. Let $z$ be any element of $M$. We have $P_M z = z$,
and we can write

$$ (P_L x - P_M x, z) = (P_L x - P_M x, P_M z). $$

On transferring $P_M$ from the right to the left and using condition
(165), we obtain

$$ (P_L x - P_M x, z) = (P_M x - P_M x, z) = 0, $$

i.e. in fact $P_L x - P_M x \perp M$. Hence (170) yields elements $y$ belonging
to $L \ominus M$. If $u$ is any element of $L \ominus M$, i.e. $u \in L$ and $u \perp M$,
then $y = P_L u - P_M u = P_L u = u$, and we can therefore finally
assert that (170) defines the subspace $L \ominus M$, and the theorem is
proved.

THEOREM 6. The necessary and sufficient condition for the product
$P_L P_M$ to be a projector is that $P_L$ and $P_M$ commute, i.e.

$$ P_L P_M = P_M P_L. \tag{171} $$

If this condition is fulfilled, $P_L P_M$ is a projector onto the
subspace $LM$.

The necessity of (171) follows from the fact that (171) is necessary
and sufficient for $P_L P_M$ to be self-conjugate. Let us now show that,
given (171), the operator $P_L P_M$ satisfies (157):

$$ (P_L P_M)(P_L P_M) = P_L^2 P_M^2 = P_L P_M. $$

The first part of the theorem is therefore proved. If $x$ is any element
of $H$, the element

$$ y = (P_L P_M)\, x = P_L(P_M x) = P_M(P_L x) \tag{172} $$

obviously belongs both to $L$ and $M$, i.e. belongs to $LM$. Conversely,
if we take any element $x_0$ of $LM$, (172) with $x = x_0$ gives us $y = x_0$.
Thus (172) defines the subspace $LM$, and the proof is complete.

THEOREM 7. The limit of a convergent sequence of projectors is a
projector.

We have $P_n \to P$, the $P_n$ being projectors and $P$ a self-conjugate
operator [131]. On passing to the limit in $P_n^2 = P_n$, we get $P^2 = P$,
from which it follows, by Theorem 1, that $P$ is a projector.

THEOREM 8. Every monotonic sequence of projectors has a limit.

We first consider a non-decreasing sequence of projectors:

$$ P_1 \le P_2 \le P_3 \le \dots \tag{173} $$

To prove that sequence (173) has a limit, we have to show that
$P_n x$ has a limit for any choice of $x$, i.e. given any positive $\varepsilon$, there
must exist an $N$ such that

$$ \|P_n x - P_m x\| \le \varepsilon \quad\text{for } n > m > N. \tag{174} $$

By (173) and Theorem 4,

$$ \|P_1 x\| \le \|P_2 x\| \le \|P_3 x\| \le \dots, $$

where we have for any $n$: $\|P_n x\| \le \|x\|$. The non-decreasing
sequence of non-negative numbers $\|P_n x\|$ therefore has a limit, and,
given any $\varepsilon > 0$, there exists an $N$ such that

$$ \|P_n x\|^2 - \|P_m x\|^2 \le \varepsilon^2 \quad\text{for } n > m > N. $$

By (157), we can write this inequality as

$$ ((P_n - P_m)\, x, x) \le \varepsilon^2 \quad\text{for } n > m > N. $$

Since $P_m$ is part of $P_n$, $P_n - P_m$ is a projector, and by (157), the
last inequality leads to (174); the theorem is proved. Notice that,
by Theorem 7, the limiting operator $P$ of sequence (173) is a projector,
and on passing to the limit as $n \to \infty$ in the inequality
$((P_n - P_m)\, x, x) \ge 0$, we get $((P - P_m)\, x, x) \ge 0$, i.e. $P \ge P_m$. It can be
shown similarly that a decreasing sequence of projectors has a limit,
and this limit is also a projector.

THEOREM 9. If $L_k$ ($k = 1, 2, \dots$) is a denumerable number of mutually
orthogonal subspaces, the sum

$$ \sum_{k} P_{L_k} \tag{175} $$

is a projector into the subspace

$$ L = L_1 \oplus L_2 \oplus L_3 \oplus \dots \tag{176} $$


This proposition is a direct consequence of what was said in [139].


141. The resolution of the identity. The Stieltjes integral. The
subsequent development of the theory of self-conjugate operators is based
on a general expression for any self-conjugate operator. Before deducing
this expression, an important new concept must be introduced.

DEFINITION. A resolution of the identity is defined as a family of
projectors $\mathcal{E}_\lambda$, depending on a real parameter $\lambda$ and satisfying the following
conditions: (1) a projector $\mathcal{E}_\lambda$ does not decrease as $\lambda$ increases, i.e. if
$\mu > \lambda$, then $\mathcal{E}_\mu \ge \mathcal{E}_\lambda$; (2) there exist finite values $\lambda = a$ and $\lambda = b$
such that $\mathcal{E}_a = 0$ and $\mathcal{E}_b = E$; (3) the projector $\mathcal{E}_\lambda$ is continuous from
the right with respect to the parameter $\lambda$, i.e.

$$ \lim_{\mu \to \lambda + 0} \mathcal{E}_\mu = \mathcal{E}_\lambda. \tag{177} $$

Notice that, by Theorem 8 of [140], given any value $\lambda'$, $\mathcal{E}_\lambda$ has
a limit as $\lambda$ tends to $\lambda'$ both from the left and the right. These limits
are projectors, which are naturally denoted by $\mathcal{E}_{\lambda' - 0}$ and $\mathcal{E}_{\lambda' + 0}$.
By (177), we must have $\mathcal{E}_{\lambda + 0} = \mathcal{E}_\lambda$. We shall say that $\mathcal{E}_\lambda$ is
continuous at the point $\lambda$ if $\mathcal{E}_\lambda = \mathcal{E}_{\lambda - 0}$. Condition (177) requires that
the projector $\mathcal{E}_\lambda$ be continuous from the right at every point. This
condition is only added so as to fix the value of $\mathcal{E}_\lambda$ at every point of
discontinuity with respect to $\lambda$.

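Let us note some properties of the resolution of the identity $\mathcal{E}_\lambda$.
Let $m$ be the strict upper bound of all the $\lambda$ for which $\mathcal{E}_\lambda = 0$, i.e.

$$ \mathcal{E}_\lambda = 0 \ \text{ for } \lambda < m \quad\text{and}\quad \mathcal{E}_\lambda > 0 \ \text{ for } \lambda > m. \tag{178} $$

At the point $\lambda = m$ itself, the projector $\mathcal{E}_\lambda$ will differ from the
zero operator if $\mathcal{E}_\lambda$ has a jump at this point. Let $M$ be the strict lower
bound of the $\lambda$ for which $\mathcal{E}_\lambda = E$. Since $\mathcal{E}_\lambda$ is continuous from the
right, we must have $\mathcal{E}_M = E$; thus $M$ is defined by the following
conditions:

$$ \mathcal{E}_\lambda < E \ \text{ for } \lambda < M \quad\text{and}\quad \mathcal{E}_\lambda = E \ \text{ for } \lambda \ge M. \tag{179} $$

If $\varepsilon_0$ is any fixed positive number, we can say that the projector
$\mathcal{E}_\lambda$ varies from 0 to $E$ as $\lambda$ varies in the interval $[m - \varepsilon_0, M]$. Further,
by Theorem 5, we can say that, when $\mu > \lambda$, $\mathcal{E}_\mu - \mathcal{E}_\lambda$ is a projector,
and

$$ \mathcal{E}_\mu \mathcal{E}_\lambda = \mathcal{E}_\lambda \mathcal{E}_\mu = \mathcal{E}_\lambda \quad (\mu > \lambda). \tag{180} $$

If we let $\lambda$ tend to $\mu$ from the left in the difference $\mathcal{E}_\mu - \mathcal{E}_\lambda$, it
will be seen that $\mathcal{E}_\mu - \mathcal{E}_{\mu - 0}$ is a projector. It can similarly be shown
that $\mathcal{E}_{\nu - 0} - \mathcal{E}_\mu$ is a projector when $\nu > \mu$. The following notation
will often be used in future. Let $\Delta$ be an interval $[\alpha, \beta]$. We write

$$ \Delta \mathcal{E}_\lambda = \mathcal{E}_\beta - \mathcal{E}_\alpha. \tag{181} $$

If $\Delta'$ and $\Delta''$ are two intervals having no common interior points,
we have by (180):

$$ \Delta' \mathcal{E}_\lambda \cdot \Delta'' \mathcal{E}_\lambda = 0 \tag{182} $$

($\Delta'$ and $\Delta''$ have no common interior points).

Using Theorem 2 of [140], we can say that the last equation is
equivalent to the following: given any elements $x$ and $y$, we have

$$ \Delta' \mathcal{E}_\lambda x \perp \Delta'' \mathcal{E}_\lambda y \quad (x \text{ and } y \text{ arbitrary}) \tag{183} $$

($\Delta'$ and $\Delta''$ have no common interior points).

If $\Delta_0$ is the common part of intervals $\Delta'$ and $\Delta''$, we have by (180):

$$ \Delta' \mathcal{E}_\lambda \cdot \Delta'' \mathcal{E}_\lambda = \Delta_0 \mathcal{E}_\lambda. \tag{184} $$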
We know how to add operators and pass to the limit in an operator
sequence. This gives us the possibility of using the resolution of the
identity $\mathcal{E}_\lambda$ to form a "Stieltjes integral" for any continuous function.
Let $f(\lambda)$ be a given continuous function, which may be complex,
in an interval $[m - \varepsilon_0, M]$, where $\varepsilon_0$ is a fixed positive number.
We subdivide the interval:

$$ m - \varepsilon_0 = \lambda_0 < \lambda_1 < \lambda_2 < \dots < \lambda_{n-1} < \lambda_n = M, \tag{185} $$

and form the "Riemann-Stieltjes sum" corresponding to this
subdivision $\delta$ of the interval:

$$ \sigma_\delta = \sum_{k=1}^{n} f(\nu_k)\, \Delta_k \mathcal{E}_\lambda = \sum_{k=1}^{n} f(\nu_k)\, (\mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_{k-1}}), \tag{186} $$

where $\nu_k$ is any value from the interval $[\lambda_{k-1}, \lambda_k]$. The sum $\sigma_\delta$ is a
linear operator. Let $\eta_\delta$ denote the greatest of the differences
$\lambda_k - \lambda_{k-1}$. The following fundamental theorem holds:

THEOREM. Given any sequence of subdivisions $\delta_n$ with the condition
that $\eta_{\delta_n} \to 0$, the sequence of operators $\sigma_{\delta_n}$ has a definite limit, in the
sense of strong convergence of the operators.

We must first prove two lemmas. 

LEMMA 1. If $a$ and $a_k$ $(k = 1, 2, \dots, n)$ are complex numbers and
$x = x_1 + x_2 + \dots + x_n$, the elements $x_k$ being mutually orthogonal,
we have

$$\Big\| a x - \sum_{k=1}^{n} a_k x_k \Big\| \le d\,\|x\|, \tag{187}$$

where $d$ is the greatest of the numbers $|a - a_k|$.

We can write

$$a x - \sum_{k=1}^{n} a_k x_k = \sum_{k=1}^{n} (a - a_k)\,x_k,$$

whence, by Pythagoras' theorem,

$$\Big\| a x - \sum_{k=1}^{n} a_k x_k \Big\|^2 = \sum_{k=1}^{n} |a - a_k|^2\,\|x_k\|^2 \le d^2 \sum_{k=1}^{n} \|x_k\|^2. \tag{188}$$

Pythagoras' theorem also gives us

$$\|x\|^2 = \sum_{k=1}^{n} \|x_k\|^2,$$

and (188) now leads directly to (187); the lemma is proved.
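Inequality (187) is easy to check numerically. The following is only a minimal sketch on hypothetical sample data: the three orthogonal vectors of $\mathbb{C}^3$, the numbers `a` and `a_k`, and the helper `norm` are all assumptions made for illustration and do not come from the text.

```python
import math

def norm(v):
    """Euclidean norm of a complex vector given as a sequence of numbers."""
    return math.sqrt(sum(abs(c) ** 2 for c in v))

# Hypothetical sample data: three mutually orthogonal elements of C^3.
xs = [(2.0, 0.0, 0.0), (0.0, -1.0, 0.0), (0.0, 0.0, 3.0)]
a = 1.0 + 0.5j
a_k = [1.2 + 0.5j, 1.0 + 0.3j, 0.9 + 0.6j]

# x = x_1 + x_2 + x_3
x = [sum(component) for component in zip(*xs)]

# a*x - sum_k a_k*x_k, computed component by component
diff = [a * x[i] - sum(ak * xk[i] for ak, xk in zip(a_k, xs)) for i in range(3)]

d = max(abs(a - ak) for ak in a_k)  # greatest of the numbers |a - a_k|
lhs = norm(diff)                    # left-hand side of (187)
rhs = d * norm(x)                   # right-hand side of (187)
print(lhs <= rhs)
```

The orthogonality of the $x_k$ is what makes Pythagoras' theorem applicable; for non-orthogonal vectors the same check may fail.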
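LEMMA 2. If $\delta$ is a subdivision (185) of the interval $[m - \varepsilon_0, M]$,
and $\delta'$ is some other subdivision:

$$m - \varepsilon_0 = \lambda'_0 < \lambda'_1 < \lambda'_2 < \dots < \lambda'_{p-1} < \lambda'_p = M$$

of the same interval, we have for any element $x$:

$$\|\sigma_\delta x - \sigma_{\delta'} x\| \le 2\omega\,\|x\|, \tag{189}$$

where $\omega$ is the greatest oscillation of the function $f(\lambda)$ in the intervals
$[\lambda_{k-1}, \lambda_k]$ and $[\lambda'_{k-1}, \lambda'_k]$, i.e. $\omega$ is the number such that

$$|f(\alpha) - f(\beta)| \le \omega, \tag{190}$$

if $\alpha$ and $\beta$ belong to the same interval $[\lambda_{k-1}, \lambda_k]$ or to the same
interval $[\lambda'_{k-1}, \lambda'_k]$.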

We form the product $\delta\delta'$ of the subdivisions. On passing from
subdivision $\delta$ to subdivision $\delta\delta'$, each sub-interval $\Delta_k$ of
$\delta$ is split into a finite number of sub-intervals $\Delta_k^{(s)}$
$(s = 1, 2, \dots, m_k)$. Each term $f(\nu_k)\,\Delta_k\mathcal{E}_\lambda x$ of the sum

$$\sigma_\delta x = \sum_{k=1}^{n} f(\nu_k)\,\Delta_k\mathcal{E}_\lambda x \tag{191}$$

is now replaced by the sum

$$\sum_{s=1}^{m_k} f(\nu_k^{(s)})\,\Delta_k^{(s)}\mathcal{E}_\lambda x,$$

where $\nu_k^{(s)}$ is a value from the sub-interval $\Delta_k^{(s)}$. Hence $\nu_k$ and
$\nu_k^{(s)}$ belong to the same sub-interval $\Delta_k$ of subdivision $\delta$, and we
have, by (190):

$$|f(\nu_k) - f(\nu_k^{(s)})| \le \omega \quad (s = 1, 2, \dots, m_k). \tag{192}$$


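We form the difference

$$\sigma_\delta x - \sigma_{\delta\delta'} x = \sum_{k=1}^{n} \Big[ f(\nu_k)\,\Delta_k\mathcal{E}_\lambda x - \sum_{s=1}^{m_k} f(\nu_k^{(s)})\,\Delta_k^{(s)}\mathcal{E}_\lambda x \Big].$$

By (183), the elements $\Delta_k\mathcal{E}_\lambda x$, $\Delta_k^{(s)}\mathcal{E}_\lambda x$ are
mutually orthogonal for different $k$, and we can write, using Pythagoras' theorem:

$$\|\sigma_\delta x - \sigma_{\delta\delta'} x\|^2 = \sum_{k=1}^{n} \Big\| f(\nu_k)\,\Delta_k\mathcal{E}_\lambda x - \sum_{s=1}^{m_k} f(\nu_k^{(s)})\,\Delta_k^{(s)}\mathcal{E}_\lambda x \Big\|^2. \tag{193}$$

We have, in addition, $\Delta_k\mathcal{E}_\lambda x = \sum_{s=1}^{m_k} \Delta_k^{(s)}\mathcal{E}_\lambda x$,
and the elements $\Delta_k^{(s)}\mathcal{E}_\lambda x$ $(s = 1, 2, \dots, m_k)$ are also
orthogonal to each other. Using Lemma 1 and (192), we obtain

$$\Big\| f(\nu_k)\,\Delta_k\mathcal{E}_\lambda x - \sum_{s=1}^{m_k} f(\nu_k^{(s)})\,\Delta_k^{(s)}\mathcal{E}_\lambda x \Big\| \le \omega\,\|\Delta_k\mathcal{E}_\lambda x\|,$$

so that, by (193),

$$\|\sigma_\delta x - \sigma_{\delta\delta'} x\|^2 \le \omega^2 \sum_{k=1}^{n} \|\Delta_k\mathcal{E}_\lambda x\|^2. \tag{194}$$

Since $\mathcal{E}_{m-\varepsilon_0} = 0$ and $\mathcal{E}_M = E$, we can write

$$x = \sum_{k=1}^{n} \Delta_k\mathcal{E}_\lambda x, \tag{195}$$

where the elements on the right are mutually orthogonal. Pythagoras'
theorem gives

$$\|x\|^2 = \sum_{k=1}^{n} \|\Delta_k\mathcal{E}_\lambda x\|^2, \tag{196}$$

and inequality (194) can be rewritten as

$$\|\sigma_\delta x - \sigma_{\delta\delta'} x\| \le \omega\,\|x\|.$$

It can similarly be shown that

$$\|\sigma_{\delta'} x - \sigma_{\delta\delta'} x\| \le \omega\,\|x\|,$$

and the statement of the lemma follows at once from

$$\|\sigma_\delta x - \sigma_{\delta'} x\| \le \|\sigma_\delta x - \sigma_{\delta\delta'} x\| + \|\sigma_{\delta'} x - \sigma_{\delta\delta'} x\|.$$

We now turn to the proof of the theorem. We have to show that,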
given any element $x$, the sequence of elements $\sigma_{\delta_n} x$ has a limit, i.e.
$\sigma_{\delta_n} x \Rightarrow y$. Once this is proved, the limiting element $y$ may easily
be seen to be independent of the choice of sequence $\delta_n$. For, if $\delta_n$
and $\delta'_n$ are two sequences of subdivisions, satisfying the condition
indicated in the theorem, and if $\sigma_{\delta_n} x \Rightarrow y$ and
$\sigma_{\delta'_n} x \Rightarrow y'$, then the sequence of subdivisions
$\delta_1, \delta'_1, \delta_2, \delta'_2, \dots$ also satisfies the condition
of the theorem, so that the sequence of elements $\sigma_{\delta_1} x$,
$\sigma_{\delta'_1} x$, $\sigma_{\delta_2} x$, $\sigma_{\delta'_2} x, \dots$ must also have
a limit. It follows at once from this that $y' = y$.

Let us establish a preliminary inequality.

The elements $\Delta_k\mathcal{E}_\lambda x$, appearing in the sum $\sigma_\delta x$, are
orthogonal to each other, and by Pythagoras' theorem:

$$\|\sigma_\delta x\|^2 = \sum_{k=1}^{n} |f(\nu_k)|^2\,\|\Delta_k\mathcal{E}_\lambda x\|^2. \tag{197}$$

Further, the continuous function $f(\lambda)$ is bounded in modulus, i.e.
$|f(\lambda)| \le p$, where $p$ is a positive number. Formula (197) leads to
the inequality

$$\|\sigma_\delta x\|^2 \le p^2 \sum_{k=1}^{n} \|\Delta_k\mathcal{E}_\lambda x\|^2, \tag{197$_1$}$$

from which it follows, by (196), that $\|\sigma_\delta x\| \le p\,\|x\|$, i.e. the norm
of the operator $\sigma_\delta$ does not exceed $p$ for any subdivision. Let us
now show that the sequence of elements $\sigma_{\delta_n} x$ has a limit for any
choice of $x$. In view of the condition of the theorem ($\eta_{\delta_n} \to 0$) and
the uniform continuity of $f(\lambda)$ in $[m - \varepsilon_0, M]$, given any
$\varepsilon > 0$, there exists an $N$ such that

$$|f(\lambda') - f(\lambda'')| \le \varepsilon,$$

if $\lambda'$ and $\lambda''$ belong to the same sub-interval of subdivision $\delta_n$ when
$n > N$. On applying Lemma 2, we can say that

$$\|\sigma_{\delta_n} x - \sigma_{\delta_m} x\| \le 2\varepsilon\,\|x\| \quad \text{for } n \text{ and } m > N,$$

i.e. the sequence $\sigma_{\delta_n} x$ is mutually convergent, and therefore tends to a
limiting element; the proof is complete. It is natural to use the ordinary
notation for the Stieltjes integral to denote the limit of the operator
sequences on indefinite subdivision (in the sense of strong convergence
of operators):

$$\lim \sum_{k=1}^{n} f(\nu_k)\,\Delta_k\mathcal{E}_\lambda = \int_{m-\varepsilon_0}^{M} f(\lambda)\,d\mathcal{E}_\lambda. \tag{198}$$

The following notation is used for the limiting element of the
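sequence of elements (191) on indefinite subdivision:

$$\lim \sum_{k=1}^{n} f(\nu_k)\,\Delta_k\mathcal{E}_\lambda x = \int_{m-\varepsilon_0}^{M} f(\lambda)\,d\mathcal{E}_\lambda x. \tag{199}$$

It can similarly be shown that the corresponding integrals

$$\int_\alpha^\beta f(\lambda)\,d\mathcal{E}_\lambda \quad\text{and}\quad \int_\alpha^\beta f(\lambda)\,d\mathcal{E}_\lambda x \tag{200}$$

exist, over any part $[\alpha, \beta]$ of the interval $[m - \varepsilon_0, M]$. Notice that
the operator $\mathcal{E}_\lambda$ and element $\mathcal{E}_\lambda x$ are constant outside
$[m - \varepsilon_0, M]$; in view of this, the integral over the finite interval
$[m - \varepsilon_0, M]$ is often written as an integral over an infinite interval:

$$\int_{m-\varepsilon_0}^{M} f(\lambda)\,d\mathcal{E}_\lambda = \int_{-\infty}^{+\infty} f(\lambda)\,d\mathcal{E}_\lambda; \qquad \int_{m-\varepsilon_0}^{M} f(\lambda)\,d\mathcal{E}_\lambda x = \int_{-\infty}^{+\infty} f(\lambda)\,d\mathcal{E}_\lambda x. \tag{201}$$

By separating out the jump (if there is one) $\mathcal{E}_m$ of the operator
$\mathcal{E}_\lambda$ at the point $\lambda = m$, the above integrals can be reduced to
integrals over the interval $[m, M]$:

$$\int_{m-\varepsilon_0}^{M} f(\lambda)\,d\mathcal{E}_\lambda = f(m)\,\mathcal{E}_m + \int_{m}^{M} f(\lambda)\,d\mathcal{E}_\lambda; \qquad \int_{m-\varepsilon_0}^{M} f(\lambda)\,d\mathcal{E}_\lambda x = f(m)\,\mathcal{E}_m x + \int_{m}^{M} f(\lambda)\,d\mathcal{E}_\lambda x. \tag{202}$$

An elementary bound exists for the integral, analogous to (197$_1$):
if $|f(\lambda)| \le p_1$ in the interval $[\alpha, \beta]$, then

$$\Big\| \int_\alpha^\beta f(\lambda)\,d\mathcal{E}_\lambda x \Big\| \le p_1\,\|(\mathcal{E}_\beta - \mathcal{E}_\alpha)\,x\|. \tag{203}$$

We shall in future simply write $m$ for the lower limit, instead of
$(m - \varepsilon_0)$. The integral thus written, over the interval $[m, M]$, will
be equivalent to the integral over the original interval, with the
addition of $f(m)\,\mathcal{E}_m$ or $f(m)\,\mathcal{E}_m x$.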
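The convergence of the sums (186) can be watched in the simplest finite-dimensional case. The sketch below is an illustration under stated assumptions, not part of the text: the matrix $A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$ with eigenvalues 1 and 3, its eigenprojectors `P1` and `P3`, the choice $f = \exp$, the uniform subdivision, and all function names are hypothetical. For such an operator the resolution of the identity is a step function, and the limit of the sums is $f(1)P_1 + f(3)P_3$.

```python
import math

# Hypothetical example: A = [[2, 1], [1, 2]] has eigenvalues 1 and 3,
# with orthogonal projectors P1 and P3 onto the two eigenspaces.
P1 = [[0.5, -0.5], [-0.5, 0.5]]
P3 = [[0.5, 0.5], [0.5, 0.5]]
ZERO = [[0.0, 0.0], [0.0, 0.0]]
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
m, M = 1.0, 3.0

def spectral_function(lam):
    """Right-continuous step function: 0 below 1, P1 on [1, 3), E from 3 on."""
    if lam < 1.0:
        return ZERO
    if lam < 3.0:
        return P1
    return IDENTITY

def combine(X, Y, c):
    """Return X + c*Y for 2x2 matrices."""
    return [[X[i][j] + c * Y[i][j] for j in range(2)] for i in range(2)]

def stieltjes_sum(f, n, eps0=0.5):
    """Sum of type (186) for the uniform subdivision of [m - eps0, M] into n parts."""
    lo = m - eps0
    points = [lo + (M - lo) * k / n for k in range(n + 1)]
    points[-1] = M  # guard against rounding at the right-hand end
    total = ZERO
    for k in range(1, n + 1):
        nu = 0.5 * (points[k - 1] + points[k])  # any value of the sub-interval
        jump = combine(spectral_function(points[k]),
                       spectral_function(points[k - 1]), -1.0)
        total = combine(total, jump, f(nu))
    return total

def max_error(n):
    """Largest entry of sigma_delta - (f(1) P1 + f(3) P3) for f = exp."""
    limit = combine(combine(ZERO, P1, math.exp(1.0)), P3, math.exp(3.0))
    approx = stieltjes_sum(math.exp, n)
    return max(abs(approx[i][j] - limit[i][j])
               for i in range(2) for j in range(2))

print(max_error(10), max_error(1000))
```

Only the sub-intervals containing a jump of the step function contribute, so the error is governed by the oscillation of $f$ on those sub-intervals, in agreement with Lemma 2.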


142. The spectral function of a self-conjugate operator. If $f(\lambda)$ has
real values, the operator $\sigma_\delta$ is a linear combination of projectors
with real coefficients, i.e. is a self-conjugate operator, and the limit
of $\sigma_\delta$ on indefinite subdivision is also a self-conjugate operator.
On putting $f(\lambda) = \lambda$, we get a self-conjugate operator $A$:

$$A = \int_m^M \lambda\,d\mathcal{E}_\lambda \tag{204}$$

or

$$Ax = \int_m^M \lambda\,d\mathcal{E}_\lambda x. \tag{205}$$

Formula (204) is fundamental to the entire theory of self-conjugate
operators. We have arrived at it by starting from a resolution of
the identity $\mathcal{E}_\lambda$. For every $\mathcal{E}_\lambda$, there is a corresponding
self-conjugate operator $A$, given by (204). The converse also holds.

THEOREM. Given any self-conjugate operator $A$, there exists a resolution
of the identity $\mathcal{E}_\lambda$ such that $A$ is expressed by (204).

The proof of this theorem is fairly complicated and, to avoid a
break in the exposition, we shall postpone it till the end of the present
section. We shall later prove a formula, in accordance with which
$\mathcal{E}_\lambda$ can be defined for a given self-conjugate operator $A$. It will follow
from this formula that different operators $A$ correspond to different
resolutions of the identity. By the above theorem, (204) represents
the general form of a bounded self-conjugate operator. If sum (191),
with $f(\lambda) = \lambda$, is multiplied by an element $y$, followed by a passage
to the limit, an expression is obtained for the scalar product $(Ax, y)$
as a Stieltjes integral:

$$(Ax, y) = \int_m^M \lambda\,d(\mathcal{E}_\lambda x, y). \tag{206}$$

It may be recalled that, if $\mathcal{E}_m$ differs from zero, the right-hand
side is to be understood as the sum:

$$m\,(\mathcal{E}_m x, y) + \int_m^M \lambda\,d(\mathcal{E}_\lambda x, y), \tag{207}$$

where the last integral is an ordinary Stieltjes integral. We could
have taken (206) as fundamental, instead of (204), since the operator
$A$ is completely defined by specifying a bilinear functional. Remember
that the scalar product $(\mathcal{E}_\lambda x, y)$ is expressed linearly in terms of four
scalar products of the form $(\mathcal{E}_\lambda z, z) = \|\mathcal{E}_\lambda z\|^2$ [125].

Since $\mathcal{E}_\mu \ge \mathcal{E}_\lambda$ when $\mu > \lambda$, $\|\mathcal{E}_\lambda z\|^2$ does not
decrease as $\lambda$ increases, so that the (in general complex) function
$(\mathcal{E}_\lambda x, y)$ under the sign of the differential is a function of bounded
variation in $\lambda$. If we put $y = x$, we get an expression for a quadratic
functional as a Stieltjes integral:

$$(Ax, x) = \int_m^M \lambda\,d(\mathcal{E}_\lambda x, x). \tag{208}$$

Here, the increasing function $(\mathcal{E}_\lambda x, x) = \|\mathcal{E}_\lambda x\|^2$ stands behind
the differential sign.

The family of projectors $\mathcal{E}_\lambda$ is usually known as the spectral function
of the self-conjugate operator $A$ defined by (204). Let us show that
the numbers $m$ and $M$ defined above coincide with the bounds of
the operator $A$ that we defined in [126]. We write down the quadratic
functional $(Ax, x)$ in the form

$$(Ax, x) = m\,\|\mathcal{E}_m x\|^2 + \int_m^M \lambda\,d\|\mathcal{E}_\lambda x\|^2.$$

The function behind the differential sign is non-decreasing in $\lambda$.
On replacing $\lambda$ first by $m$, then by $M$, we arrive at the inequalities

$$m\,\|\mathcal{E}_M x\|^2 \le (Ax, x) \le m\,\|\mathcal{E}_m x\|^2 + M\big(\|\mathcal{E}_M x\|^2 - \|\mathcal{E}_m x\|^2\big),$$

or, since $\mathcal{E}_M x = x$, at the inequalities

$$m\,\|x\|^2 \le (Ax, x) \le M\,\|x\|^2.$$

It remains to show that $m$ and $M$ are the strict bounds of $(Ax, x)$
when $\|x\| = 1$. Let us show e.g. that $M$ is the strict upper bound.
The difference $\mathcal{E}_M - \mathcal{E}_{M-\varepsilon} = E - \mathcal{E}_{M-\varepsilon}$, where
$\varepsilon$ is a given positive number, is a projector differing from the zero
operator. Suppose that the normalized element $x$ belongs to the subspace
corresponding to this projector. Then $(E - \mathcal{E}_{M-\varepsilon})\,x = x$, i.e.
$\mathcal{E}_{M-\varepsilon}\,x = 0$, and all the more $\mathcal{E}_\lambda x = 0$ for
$\lambda < M - \varepsilon$. Hence, on replacing the factor $\lambda$ by $(M - \varepsilon)$
in (208) and taking $\|x\| = 1$, we can write

$$(Ax, x) \ge (M - \varepsilon)\,\|(E - \mathcal{E}_{M-\varepsilon})\,x\|^2 = (M - \varepsilon),$$

whence it follows, since $\varepsilon$ is arbitrary, that $M$ is the strict upper
bound of $(Ax, x)$ when $\|x\| = 1$.


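Another formula may be deduced. We multiply both sides of the
equation

$$\sigma_\delta = \sum_{k=1}^{n} \lambda_k\,\Delta_k\mathcal{E}_\mu = \sum_{k=1}^{n} \lambda_k\,(\mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_{k-1}}) \tag{209}$$

by $\mathcal{E}_\lambda$, with the assumption that $\lambda$ is one of the points of subdivision
$\lambda_k$ (we write $\mu$ for the variable of integration, to keep it distinct from
the fixed value $\lambda$). Now, by (180), we have
$\mathcal{E}_\lambda \cdot \Delta_k\mathcal{E}_\mu = \Delta_k\mathcal{E}_\mu \cdot \mathcal{E}_\lambda = 0$ for
$\lambda < \lambda_k$ and
$\mathcal{E}_\lambda \cdot \Delta_k\mathcal{E}_\mu = \Delta_k\mathcal{E}_\mu \cdot \mathcal{E}_\lambda = \Delta_k\mathcal{E}_\mu$
for $\lambda \ge \lambda_k$, so that

$$\mathcal{E}_\lambda \sigma_\delta = \sigma_\delta \mathcal{E}_\lambda = \sum_{\lambda_k \le \lambda} \lambda_k\,\Delta_k\mathcal{E}_\mu,$$

and a passage to the limit gives us

$$\mathcal{E}_\lambda A = A\,\mathcal{E}_\lambda = \int_m^\lambda \mu\,d\mathcal{E}_\mu \quad (\lambda \ge m) \tag{210}$$

together with the analogous formula for a bilinear functional:

$$(\mathcal{E}_\lambda A x, y) = \int_m^\lambda \mu\,d(\mathcal{E}_\mu x, y). \tag{211}$$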

143. Continuous functions of a self-conjugate operator. If $A$ is a
self-conjugate operator defined by (204), we can associate with any
function $f(\lambda)$, continuous in the interval $[m, M]$, an operator $f(A)$,
defined by

$$f(A) = \int_m^M f(\lambda)\,d\mathcal{E}_\lambda. \tag{212}$$

This correspondence between a continuous function $f(\lambda)$ and a
continuous operator $f(A)$ is distributive, i.e. the operator
$c_1 f_1(A) + c_2 f_2(A)$ corresponds to the continuous function
$c_1 f_1(\lambda) + c_2 f_2(\lambda)$. This is an immediate consequence of the fact
that integral (212) is distributive with respect to $f(\lambda)$. Moreover, the
correspondence is multiplicative, i.e. the operator $f_1(A)\,f_2(A)$ (or the
operator equal to it, $f_2(A)\,f_1(A)$) corresponds to the function
$f_1(\lambda)\,f_2(\lambda)$. To prove this, we form the product of sums $\sigma_\delta$ for
$f_1(\lambda)$ and $f_2(\lambda)$:

$$\sum_{k=1}^{n} f_1(\nu_k)\,\Delta_k\mathcal{E}_\lambda \cdot \sum_{k=1}^{n} f_2(\nu_k)\,\Delta_k\mathcal{E}_\lambda.$$
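Using (182) and (184), we can write the above product as

$$\sum_{k=1}^{n} f_1(\nu_k)\,\Delta_k\mathcal{E}_\lambda \cdot \sum_{k=1}^{n} f_2(\nu_k)\,\Delta_k\mathcal{E}_\lambda = \sum_{k=1}^{n} f_1(\nu_k)\,f_2(\nu_k)\,\Delta_k\mathcal{E}_\lambda, \tag{213}$$

and passage to the limit gives us

$$\int_m^M f_1(\lambda)\,d\mathcal{E}_\lambda \cdot \int_m^M f_2(\lambda)\,d\mathcal{E}_\lambda = \int_m^M f_1(\lambda)\,f_2(\lambda)\,d\mathcal{E}_\lambda, \tag{214}$$

which is what we set out to prove. Formulae analogous to (212) can
be written for a bilinear and quadratic functional:

$$(f(A)x, y) = \int_m^M f(\lambda)\,d(\mathcal{E}_\lambda x, y); \qquad (f(A)x, x) = \int_m^M f(\lambda)\,d\|\mathcal{E}_\lambda x\|^2. \tag{215}$$

Further, we have the formula analogous to (210):

$$\mathcal{E}_\lambda f(A) = f(A)\,\mathcal{E}_\lambda = \int_m^\lambda f(\mu)\,d\mathcal{E}_\mu. \tag{216}$$

On taking (214) into account, we get the following formula for
positive integral powers of $A$:

$$A^n = \int_m^M \lambda^n\,d\mathcal{E}_\lambda \quad (n = 1, 2, 3, \dots) \tag{217}$$

and for a polynomial:

$$a_0 A^n + a_1 A^{n-1} + \dots + a_{n-1} A + a_n = \int_m^M \big(a_0 \lambda^n + a_1 \lambda^{n-1} + \dots + a_{n-1}\lambda + a_n\big)\,d\mathcal{E}_\lambda. \tag{218}$$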


As mentioned above, if $f(\lambda)$ is a real function, the operator $\sigma_\delta$ is
self-conjugate, and the limit of $\sigma_\delta$, i.e. $f(A)$, is also a self-conjugate
operator. If $f(\lambda) \ge 0$ in the interval $[m, M]$, by (215), the operator
$f(A)$ is positive. Now suppose that $f(\lambda)$ is complex:
$f(\lambda) = \varphi(\lambda) + \psi(\lambda)\,i$. We now have $f(A) = \varphi(A) + i\psi(A)$,
where $\varphi(A)$ and $\psi(A)$ are self-conjugate operators. On forming the operator
$\bar f(A) = \varphi(A) - i\psi(A)$, we can use the fact that $\varphi(A)$ and $\psi(A)$
are self-conjugate to write

$$(f(A)x, y) = (x, \bar f(A)\,y),$$

i.e. the operator $\bar f(A)$ is the conjugate to $f(A)$.

Some commutation properties should be noticed. It follows from
(210) that the operator $\mathcal{E}_\lambda$ commutes with $A$ for any value of $\lambda$.
Hence the operator $\Delta\mathcal{E}_\lambda = \mathcal{E}_\beta - \mathcal{E}_\alpha$ also commutes with
$A$, for any values of $\alpha$ and $\beta$. Thus the sum $\sigma_\delta$ commutes with $A$,
and we find on passing to the limit that $f(A)$ commutes with $A$. Let us now
prove a theorem.

THEOREM 1. The operator $f(A)$ commutes with any operator $B$ that
commutes with $A$.

Let $\varepsilon_n$ be a sequence of positive numbers tending to zero. By
Weierstrass's theorem [II; 154], there exists a sequence of polynomials
$P_n(\lambda)$ such that

$$|f(\lambda) - P_n(\lambda)| \le \varepsilon_n \quad (m \le \lambda \le M). \tag{219}$$

We form the difference

$$f(A) - P_n(A) = \int_m^M [f(\lambda) - P_n(\lambda)]\,d\mathcal{E}_\lambda.$$

On taking (203) and (219) into account, we can write

$$\|[f(A) - P_n(A)]\,x\| \le \varepsilon_n\,\|x\|, \tag{220}$$

whence $P_n(A) \to f(A)$. An operator $B$ that commutes with $A$ also
commutes with any polynomial $P_n(A)$, i.e. $B\,P_n(A) = P_n(A)\,B$.
On passing to the limit, we get $B\,f(A) = f(A)\,B$, and the theorem is
proved. We shall show later [161] that, given any $\lambda$, the spectral
function $\mathcal{E}_\lambda$ commutes with any operator $B$ that commutes with $A$.
Conversely, if $B$ commutes with $\mathcal{E}_\lambda$, $B$ commutes with any operator
$\Delta\mathcal{E}_\lambda$, and hence commutes with sum (209) and, in the limit, with
the operator $A$. Thus we have the following theorem.

THEOREM 2. The necessary and sufficient condition for an operator
to commute with $A$ is that it commute with $\mathcal{E}_\lambda$ for any $\lambda$.

The following example of a function of an operator has already
been utilized [138]. Let $A$ be a positive operator, i.e. $m \ge 0$, and
let $f(\lambda) = \sqrt{\lambda}$ $(\lambda \ge 0)$, where the arithmetic value of the root is
taken. We can define the positive operator $\sqrt{A}$:

$$\sqrt{A} = \int_m^M \sqrt{\lambda}\,d\mathcal{E}_\lambda,$$

or

$$(\sqrt{A}\,x, y) = \int_m^M \sqrt{\lambda}\,d(\mathcal{E}_\lambda x, y).$$

By (214), we have $\sqrt{A}\,\sqrt{A} = A$.
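The square root can be checked directly in a finite-dimensional case, where integral (212) reduces to a finite sum over the eigenvalues. The sketch below is an illustration under assumptions of our own: the positive matrix $A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$, its eigenvalues 1 and 3, the eigenprojectors `P1`, `P3`, and the helper names are hypothetical.

```python
import math

# Hypothetical example: A = [[2, 1], [1, 2]] is positive, with eigenvalues
# 1 and 3 and eigenprojectors P1, P3 (so A = 1*P1 + 3*P3).
P1 = [[0.5, -0.5], [-0.5, 0.5]]
P3 = [[0.5, 0.5], [0.5, 0.5]]
A = [[2.0, 1.0], [1.0, 2.0]]

def function_of_operator(f):
    """f(A) = f(1) P1 + f(3) P3, the finite form of integral (212)."""
    return [[f(1.0) * P1[i][j] + f(3.0) * P3[i][j] for j in range(2)]
            for i in range(2)]

def matmul(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

sqrt_A = function_of_operator(math.sqrt)
square = matmul(sqrt_A, sqrt_A)

# Largest entry of sqrt(A)*sqrt(A) - A; it should vanish up to rounding.
residual = max(abs(square[i][j] - A[i][j]) for i in range(2) for j in range(2))
print(residual)
```

The multiplicative property (214) shows up here as $P_1 P_3 = 0$, $P_1^2 = P_1$, $P_3^2 = P_3$, which is why the cross terms drop out of the square.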


144. A formula for the resolvent and a characteristic of regular
values of $\lambda$.

The spectral function can be used to give a formula for the resolvent
[130] and to indicate a new characteristic of regular values of $\lambda$.
We shall in future speak of the resolvent $R_l$ only with regular values
of $l$.

THEOREM 1. If $l$ is a non-real number, or is real but outside the interval
$[m, M]$, the resolvent $R_l$ of the operator $A$ is defined by

$$R_l = \int_m^M \frac{1}{\lambda - l}\,d\mathcal{E}_\lambda. \tag{221}$$

Given the hypotheses, the function $1/(\lambda - l)$ is continuous in the
interval $[m - \varepsilon_0, M]$ for sufficiently small $\varepsilon_0$. By using (214), we
can write

$$\int_m^M \frac{1}{\lambda - l}\,d\mathcal{E}_\lambda \cdot \int_m^M (\lambda - l)\,d\mathcal{E}_\lambda = \int_m^M (\lambda - l)\,d\mathcal{E}_\lambda \cdot \int_m^M \frac{1}{\lambda - l}\,d\mathcal{E}_\lambda = \int_m^M d\mathcal{E}_\lambda = E; \tag{222}$$

but

$$\int_m^M (\lambda - l)\,d\mathcal{E}_\lambda = A - lE,$$

and (222) leads directly to (221).
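In finite dimensions formula (221) becomes a finite sum over the eigenvalues, and the identity $(A - lE)\,R_l = E$ can be verified numerically. The following is only a sketch under assumptions that are ours, not the text's: the matrix $A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$ with eigenvalues 1 and 3, the eigenprojectors `P1`, `P3`, the sample values of $l$, and the helper names.

```python
# Hypothetical example: A = 1*P1 + 3*P3 with eigenprojectors P1 and P3.
P1 = [[0.5, -0.5], [-0.5, 0.5]]
P3 = [[0.5, 0.5], [0.5, 0.5]]
A = [[2.0, 1.0], [1.0, 2.0]]

def resolvent(l):
    """Finite form of (221): R_l = P1/(1 - l) + P3/(3 - l)."""
    return [[P1[i][j] / (1.0 - l) + P3[i][j] / (3.0 - l) for j in range(2)]
            for i in range(2)]

def matmul(X, Y):
    """Product of two 2x2 (possibly complex) matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def residual(l):
    """Largest entry of (A - lE) R_l - E; it should vanish up to rounding."""
    shifted = [[A[i][j] - (l if i == j else 0.0) for j in range(2)]
               for i in range(2)]
    product = matmul(shifted, resolvent(l))
    return max(abs(product[i][j] - (1.0 if i == j else 0.0))
               for i in range(2) for j in range(2))

print(residual(0.5 + 2.0j))  # a non-real value of l
print(residual(5.0))         # a real value of l outside [m, M] = [1, 3]
```

Both sample values of $l$ satisfy the hypotheses of Theorem 1; at $l = 1$ or $l = 3$ the sum is undefined, which reflects the fact that these are points of the spectrum.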

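THEOREM 2. If $l$ belongs to the interval $[m, M]$, but lies inside some
interval $[\alpha, \beta]$ in which $\mathcal{E}_\lambda$ is constant, i.e.
$\mathcal{E}_\alpha = \mathcal{E}_\beta$, the resolvent $R_l$ exists and is given by (221).

We split $[m - \varepsilon_0, M]$ into three parts: $[m - \varepsilon_0, \alpha]$,
$[\alpha, \beta]$ and $[\beta, M]$. The function $1/(\lambda - l)$ is continuous in
$[m - \varepsilon_0, \alpha]$ and $[\beta, M]$, whilst $\mathcal{E}_\lambda$ is constant in
$[\alpha, \beta]$, and all the $\Delta_k\mathcal{E}_\lambda$ are annihilation
operators for this last interval. We extend the function $1/(\lambda - l)$
from the extreme to the central interval $[\alpha, \beta]$ in such a way that
it is continuous throughout $[m - \varepsilon_0, M]$. Let $\varphi(\lambda)$ denote the
function thus formed. The value of the integral

$$R = \int_m^M \varphi(\lambda)\,d\mathcal{E}_\lambda \tag{223}$$

is obviously independent of the values of $\varphi(\lambda)$ in $[\alpha, \beta]$. By using
(214), we can write

$$\int_m^M \varphi(\lambda)\,d\mathcal{E}_\lambda \cdot \int_m^M (\lambda - l)\,d\mathcal{E}_\lambda = \int_m^M (\lambda - l)\,d\mathcal{E}_\lambda \cdot \int_m^M \varphi(\lambda)\,d\mathcal{E}_\lambda = \int_m^M (\lambda - l)\,\varphi(\lambda)\,d\mathcal{E}_\lambda,$$

and, on observing that $\varphi(\lambda) = 1/(\lambda - l)$ for $\lambda \le \alpha$ and
$\lambda \ge \beta$, and that $\mathcal{E}_\lambda$ is constant in $[\alpha, \beta]$, we get

$$\int_m^M \varphi(\lambda)\,(\lambda - l)\,d\mathcal{E}_\lambda = \int_m^\alpha d\mathcal{E}_\lambda + \int_\beta^M d\mathcal{E}_\lambda = \mathcal{E}_\alpha + (E - \mathcal{E}_\beta) = E,$$

whence it follows that (223) yields the resolvent. We can obviously
take integral (221) instead of (223), the integration being carried out
from $m - \varepsilon_0$ to $\alpha$ and from $\beta$ to $M$. It follows from the proof of the
theorem that a value $\lambda = l$ from the interval $[m, M]$ is regular
provided it can be covered by an interval in which $\mathcal{E}_\lambda$ is constant. We
show in the next theorem that this condition is necessary, and not
merely sufficient, for regularity.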

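THEOREM 3. If, given the real value $\lambda = l$, the resolvent $R_l$ exists,
$l$ must lie inside an interval $[\alpha, \beta]$ in which $\mathcal{E}_\lambda$ is constant.

Let $[\alpha, \beta]$ be any interval containing $l$ as an interior point, and
$\Delta\mathcal{E}_\lambda = \mathcal{E}_\beta - \mathcal{E}_\alpha$. By definition of the resolvent,
$\Delta\mathcal{E}_\lambda x = R_l\,(A - lE)\,\Delta\mathcal{E}_\lambda x$. But we can write, by (216),

$$(A - lE)\,\Delta\mathcal{E}_\lambda x = \int_\alpha^\beta (\lambda - l)\,d\mathcal{E}_\lambda x,$$

so that

$$\Delta\mathcal{E}_\lambda x = R_l \Big[ \int_\alpha^\beta (\lambda - l)\,d\mathcal{E}_\lambda x \Big] \quad (\alpha < l < \beta).$$

Let us write $N$ for the norm of the operator $R_l$:

$$\|\Delta\mathcal{E}_\lambda x\| \le N\,\Big\| \int_\alpha^\beta (\lambda - l)\,d\mathcal{E}_\lambda x \Big\|. \tag{224}$$

We have $|\lambda - l| \le \beta - \alpha$ in the interval $[\alpha, \beta]$, and, by (203),
inequality (224) leads to

$$\|\Delta\mathcal{E}_\lambda x\| \le N\,(\beta - \alpha)\,\|\Delta\mathcal{E}_\lambda x\|. \tag{225}$$

The interval $[\alpha, \beta]$, containing $l$ as an interior point, is taken so
small that $N(\beta - \alpha) < 1$. It now follows at once from (225) that
$\|\Delta\mathcal{E}_\lambda x\| = 0$, i.e. $\mathcal{E}_\beta = \mathcal{E}_\alpha$, and $\mathcal{E}_\lambda$ is
constant in $[\alpha, \beta]$. By combining Theorems 2 and 3, we get the corollary: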

COROLLARY. The necessary and sufficient condition for a real $\lambda$ to be
regular is that $\lambda$ lie inside an interval in which $\mathcal{E}_\lambda$ is constant.

It follows at once from this corollary that, if a real $\lambda$ is regular,
all the real $\lambda$ sufficiently close to it are also regular, i.e. the regular
values of $\lambda$ form an open set on the real axis; this gives us the further
corollary: the points of the spectrum form a closed set.

A bilinear functional for the operator $R_l$ has the formula

$$(R_l x, y) = \int_m^M \frac{1}{\lambda - l}\,d(\mathcal{E}_\lambda x, y). \tag{226}$$

We can take $(-\infty, +\infty)$ as the interval of integration in this
integral [108], and $(\mathcal{E}_\lambda x, y)$ is a function of bounded variation in $\lambda$.
On setting $l = \sigma + \tau i$ and applying the inversion formula for a
Cauchy-Stieltjes integral [30], we get the following expression for the
spectral function in terms of the resolvent:

$$\frac{1}{2}\big[(\mathcal{E}_{\lambda-0}\,x, y) + (\mathcal{E}_{\lambda+0}\,x, y)\big] = \lim_{\tau \to +0} \frac{1}{2\pi i} \int_{-\infty}^{\lambda} \big((R_{\sigma+\tau i} - R_{\sigma-\tau i})\,x, y\big)\,d\sigma. \tag{227}$$

If $\mathcal{E}_\lambda$ is continuous at the point $\lambda$, the left-hand side of (227) is
equal to $(\mathcal{E}_\lambda x, y)$. At points of discontinuity, $(\mathcal{E}_\lambda x, y)$ is defined
from its continuity from the right. Notice that the resolvent $R_l$ is
defined in terms of the operator $A$ itself, i.e. it follows from (227)
that, given a self-conjugate operator $A$, there is only one spectral
function in terms of which it is expressed by (204). Notice also that,
by Theorem 1 of [143], the operators $R_l$ commute for different $l$.


145. Eigenvalues and eigenelements. The spectral function can 
be used to give a very simple definition of the eigenvalues and eigen- 
elements of a self-conjugate operator. 

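THEOREM. The necessary and sufficient condition for $\lambda = \lambda_0$ to be
an eigenvalue of the self-conjugate operator $A$ with spectral function
$\mathcal{E}_\lambda$ is that $\mathcal{E}_\lambda$ have $\lambda_0$ as a point of discontinuity, i.e.
$\mathcal{E}_{\lambda_0} - \mathcal{E}_{\lambda_0 - 0} > 0$. In this case
$\mathcal{E}_{\lambda_0} - \mathcal{E}_{\lambda_0 - 0}$ is the projector into the subspace $M_0$ of
eigenelements corresponding to the eigenvalue $\lambda_0$.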

Let $M_0$ be the subspace corresponding to the projector
$\mathcal{E}_{\lambda_0} - \mathcal{E}_{\lambda_0 - 0}$. If $\lambda_0$ is a point of continuity of
$\mathcal{E}_\lambda$, $M_0$ consists of the zero element only. The proof amounts to
proving the following two assertions: if $x_0 \in M_0$, then
$(A - \lambda_0 E)\,x_0 = 0$, and conversely, if $(A - \lambda_0 E)\,x_0 = 0$, then
$x_0 \in M_0$. Suppose first that $x_0 \in M_0$, i.e.
$(\mathcal{E}_{\lambda_0} - \mathcal{E}_{\lambda_0 - 0})\,x_0 = x_0$. Now, all the more,
$\mathcal{E}_{\lambda_0} x_0 = x_0$, so that $\mathcal{E}_{\lambda_0 - 0}\,x_0 = 0$. Since
$\mathcal{E}_\lambda$ does not decrease as $\lambda$ increases, we can say that
$\mathcal{E}_\lambda x_0 = x_0$ for $\lambda \ge \lambda_0$ and $\mathcal{E}_\lambda x_0 = 0$ for
$\lambda < \lambda_0$. We apply (205) to the element $x_0$:

$$A x_0 = \int_m^M \lambda\,d\mathcal{E}_\lambda x_0, \tag{228}$$

and we can assume when forming the sums $\sigma_\delta$ that $\lambda_0$ is a point of
subdivision. By the foregoing, all the differences $\Delta_k\mathcal{E}_\lambda x_0$ vanish
with the exception of one, corresponding to the sub-interval with
right-hand end $\lambda_0$, i.e.

$$A x_0 = \lim_{\varepsilon \to 0} \nu\,(\mathcal{E}_{\lambda_0} - \mathcal{E}_{\lambda_0 - \varepsilon})\,x_0,$$

where $\lambda_0 - \varepsilon \le \nu \le \lambda_0$. We obtain on passing to the limit:

$$A x_0 = \lambda_0\,(\mathcal{E}_{\lambda_0} - \mathcal{E}_{\lambda_0 - 0})\,x_0;$$

but $x_0 \in M_0$ by hypothesis, so that the right-hand side is here equal
to $\lambda_0 x_0$, i.e. $x_0$ satisfies the equation $(A - \lambda_0 E)\,x_0 = 0$. Now suppose
conversely that $x_0$ satisfies this equation, and let us show that $x_0 \in M_0$.
It follows from $(A - \lambda_0 E)\,x_0 = 0$ that

$$((A - \lambda_0 E)^2 x_0, x_0) = 0, \tag{229}$$

or, on expressing the bilinear functional in terms of the Stieltjes
integral:

$$\int_m^M (\lambda - \lambda_0)^2\,d\|\mathcal{E}_\lambda x_0\|^2 = 0. \tag{230}$$

The integrable function $(\lambda - \lambda_0)^2$ is non-negative, and the function
behind the differential sign is a non-decreasing function of $\lambda$. It follows
from this that all the elements of integral (230) are non-negative,
and the magnitude of this integral over any part of the interval of
integration must also be zero. Given $\varepsilon > 0$, we can write

$$\int_{\lambda_0 + \varepsilon}^{M} (\lambda - \lambda_0)^2\,d\|\mathcal{E}_\lambda x_0\|^2 = 0. \tag{231}$$

The integrable function $(\lambda - \lambda_0)^2$ is $\ge \varepsilon^2$ in the interval of
integration, and it follows all the more from (231) that

$$\varepsilon^2 \int_{\lambda_0 + \varepsilon}^{M} d\|\mathcal{E}_\lambda x_0\|^2 = 0,$$

i.e. $\varepsilon^2\big[\|x_0\|^2 - \|\mathcal{E}_{\lambda_0 + \varepsilon}\,x_0\|^2\big] = 0$.
Hence it follows, since $\varepsilon$ is arbitrary, that $\mathcal{E}_\lambda x_0 = x_0$ for
$\lambda > \lambda_0$. It can similarly be shown that $\mathcal{E}_\lambda x_0 = 0$ for
$\lambda < \lambda_0$. It follows at once from this that

$$x_0 = \lim_{\varepsilon \to +0} (\mathcal{E}_{\lambda_0 + \varepsilon} - \mathcal{E}_{\lambda_0 - \varepsilon})\,x_0 = (\mathcal{E}_{\lambda_0} - \mathcal{E}_{\lambda_0 - 0})\,x_0,$$

and the theorem is therefore proved.

If the operator $A$ has eigenvalues, on introducing a closed orthonormal
system into each subspace of eigenelements corresponding
to a fixed eigenvalue, we arrive at an orthonormal system of
eigenelements of the operator $A$ [128]:

$$x_1, x_2, x_3, \dots \tag{232}$$

and at a sequence of corresponding eigenvalues:

$$\mu_1, \mu_2, \mu_3, \dots \tag{233}$$

If $r$ is the rank of an eigenvalue, this latter figures $r$ times in
sequence (233). The number $r$ may in fact be infinite.

On writing $\lambda_k$ $(k = 1, 2, \dots)$ for the points at which $\mathcal{E}_\lambda$ is
discontinuous, and $L_k$ for the corresponding subspaces of eigenelements,
we can write

$$P_{L_k} = \mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_k - 0}. \tag{234}$$

We form the orthogonal sum of subspaces $L_k$:

$$H' = L_1 \oplus L_2 \oplus L_3 \oplus \dots \tag{235}$$

As we know, the projection operator into subspace $H'$ is given by

$$P_{H'} = P_{L_1} + P_{L_2} + P_{L_3} + \dots \tag{236}$$

$H'$ is the subspace consisting of elements $x$ that are expressible
in terms of elements of the orthonormal system (232) with the aid
of the convergent series

$$x = a_1 x_1 + a_2 x_2 + a_3 x_3 + \dots \tag{237}$$


146. Purely point spectra. A self-conjugate operator $A$ is said to
have a purely point spectrum if the orthonormal system (232) is
closed in space $H$ (which is separable) [cf. 128]. This is equivalent
to the fact that the subspace $H'$, defined by (235), is the same as $H$,
or that the projector $P_{H'}$ defined by (236) is the identity
transformation, i.e.

$$E = P_{H'}. \tag{238}$$

On multiplying both sides of this formula by $\mathcal{E}_\lambda$ and recalling that
$(\mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_k - 0})\,\mathcal{E}_\lambda = 0$ for $\lambda < \lambda_k$ and is
equal to $\mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_k - 0}$ for $\lambda \ge \lambda_k$,
we obtain an expression for $\mathcal{E}_\lambda$ in terms of the jumps of this projector:

$$\mathcal{E}_\lambda = \sum_{\lambda_k \le \lambda} P_{L_k} = \sum_{\lambda_k \le \lambda} (\mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_k - 0}). \tag{239}$$

In the present case, any element $x$ is expressible by series (237),
the $a_s$ being the Fourier coefficients of $x$ with respect to system
(232). On applying the operator $A$ to both sides of (237) and recalling
that $A x_s = \mu_s x_s$, we obtain

$$Ax = \sum_s a_s \mu_s x_s. \tag{240}$$

On forming the scalar product with $y$ and writing $b_s$ for the Fourier
coefficients of the element $y$, i.e.

$$b_s = (y, x_s); \qquad \bar b_s = (x_s, y),$$

we obtain the expression for a bilinear functional:

$$(Ax, y) = \sum_s \mu_s a_s \bar b_s. \tag{241}$$

When $y = x$, we get the formula for a quadratic functional:

$$(Ax, x) = \sum_s \mu_s\,|a_s|^2, \tag{242}$$

which is entirely analogous to the expression for a quadratic form
(Hermitian form) as a sum of squares. Thus, in the case of a purely
point spectrum, very simple expressions are obtained for the operator
$A$ itself, and for the bilinear and quadratic functionals, with the
aid of the orthonormal system (232). Let us now turn to the
so-called purely continuous spectrum.
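Formulae (239) and (242) can be tried out on the simplest example of a purely point spectrum, a symmetric matrix. The sketch below rests on assumptions of our own: the real matrix $A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$, its orthonormal eigenelements and eigenvalues 1 and 3, the test element `x`, and the helper names are hypothetical. Since all quantities are real, complex conjugation is omitted in the scalar product.

```python
import math

# Hypothetical example: orthonormal eigenelements and eigenvalues of
# the real symmetric matrix A = [[2, 1], [1, 2]].
s = 1.0 / math.sqrt(2.0)
eigen_elements = [(s, -s), (s, s)]
eigenvalues = [1.0, 3.0]
A = [[2.0, 1.0], [1.0, 2.0]]

def inner(u, v):
    """Scalar product of two real vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def spectral_function(lam):
    """Formula (239): sum of the eigenprojectors whose eigenvalue is <= lam."""
    result = [[0.0, 0.0], [0.0, 0.0]]
    for mu, e in zip(eigenvalues, eigen_elements):
        if mu <= lam:
            for i in range(2):
                for j in range(2):
                    result[i][j] += e[i] * e[j]  # projector onto the span of e
    return result

x = (3.0, 1.0)
a = [inner(x, e) for e in eigen_elements]        # Fourier coefficients a_s
Ax = tuple(inner(row, x) for row in A)

quadratic_direct = inner(Ax, x)                  # (Ax, x) computed directly
quadratic_series = sum(mu * abs(a_s) ** 2        # right-hand side of (242)
                       for mu, a_s in zip(eigenvalues, a))
print(quadratic_direct, quadratic_series)
```

The step function returned by `spectral_function` jumps exactly at the eigenvalues, and each jump is the projector onto the corresponding subspace of eigenelements, in line with the theorem of [145].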


147. A continuous simple spectrum. A self-conjugate operator $A$ is
said to have a purely continuous spectrum if the spectral function
$\mathcal{E}_\lambda$ is continuous for all values of $\lambda$. Our problem will be to obtain
formulae for the case of a purely continuous spectrum analogous to
the formulae of the previous section. As a preliminary, a new concept
must be introduced.

Let $e$ be a set of elements of $H$, and $x_1, x_2, \dots, x_n$ any given elements
of $H$ belonging to $e$. We form the linear combination
$c_1 x_1 + c_2 x_2 + \dots + c_n x_n$, with arbitrary coefficients $c_k$. The set of
elements of $H$ which can thus be written as finite linear combinations of
elements of $e$ is clearly a lineal, say $L$. Let us introduce our new concept.

DEFINITION. The closure of the lineal $L$ is called the closed linear
envelope of the set $e$ of elements of $H$.

The closed linear envelope is a subspace, and the characteristic of
elements $x$ belonging to it is as follows: given any $\varepsilon > 0$, there exists
a finite set of elements $x_1, x_2, \dots, x_n$ belonging to $e$, and there exist
numbers $c_k$, such that

$$\|x - (c_1 x_1 + c_2 x_2 + \dots + c_n x_n)\| \le \varepsilon.$$

In particular, all finite linear combinations of elements of $e$ obviously
belong to our subspace.

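Let $\mathcal{E}_\lambda$ be the spectral function of an operator with a purely
continuous spectrum. Notice that $\mathcal{E}_m = 0$ in this case. We take a non-zero
element $x$ and form the set of elements

$$\mathcal{E}_\lambda x, \tag{243}$$

where $\lambda$ runs through all values from $m$ to $M$. Let $C_x$ denote the closed
linear envelope of elements (243). We can form a continuous function
of $\lambda$ corresponding to any given element $y$ of $H$:

$$\varphi_y(\lambda) = (y, \mathcal{E}_\lambda x). \tag{244}$$

This function is obviously distributive with respect to the
subscript $y$, i.e.

$$\varphi_{ay+bz}(\lambda) = a\,\varphi_y(\lambda) + b\,\varphi_z(\lambda).$$

Let us also form the following two continuous functions of $\lambda$:

$$\rho(\lambda) = (\mathcal{E}_\lambda x, x) = \|\mathcal{E}_\lambda x\|^2; \qquad h_y(\lambda) = (\mathcal{E}_\lambda y, y) = \|\mathcal{E}_\lambda y\|^2. \tag{245}$$

As we know, these do not decrease as $\lambda$ increases. If $\Delta$ is any interval
$[\alpha, \beta]$, we can introduce the usual notation for any function $f(\lambda)$:

$$\Delta f(\lambda) = f(\beta) - f(\alpha). \tag{246}$$

We have, for instance,

$$\Delta\rho(\lambda) = (\Delta\mathcal{E}_\lambda x, x) = ((\mathcal{E}_\beta - \mathcal{E}_\alpha)\,x, x),$$

i.e.

$$\Delta\rho(\lambda) = \|\Delta\mathcal{E}_\lambda x\|^2, \tag{247}$$

and similarly,

$$\Delta h_y(\lambda) = \|\Delta\mathcal{E}_\lambda y\|^2. \tag{248}$$

We have for the function $\varphi_y(\lambda)$:

$$\Delta\varphi_y(\lambda) = (y, \Delta\mathcal{E}_\lambda x) = (y, (\Delta\mathcal{E}_\lambda)^2 x) = (\Delta\mathcal{E}_\lambda y, \Delta\mathcal{E}_\lambda x),$$

so that

$$|\Delta\varphi_y(\lambda)|^2 \le \|\Delta\mathcal{E}_\lambda x\|^2 \cdot \|\Delta\mathcal{E}_\lambda y\|^2,$$

i.e.

$$|\Delta\varphi_y(\lambda)|^2 \le \Delta\rho(\lambda)\,\Delta h_y(\lambda).$$

It is clear from this that the integral exists [81]:

$$\int_m^M \frac{|d\varphi_y(\lambda)|^2}{d\rho(\lambda)}. \tag{249}$$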

It will be shown below that, if $y \in C_x$, this integral is equal to
$\|y\|^2$. We split the interval $[m, M]$ into sub-intervals $\Delta_k$
$(k = 1, 2, \dots, n)$ and form the following elements of space $H$:

$$\frac{\Delta_k\mathcal{E}_\lambda x}{\sqrt{\Delta_k\rho(\lambda)}}. \tag{250}$$

If the function $\rho(\lambda)$ is constant in the interval $\Delta_k$, then
$\Delta_k\mathcal{E}_\lambda x = 0$, and the corresponding expression (250) is meaningless.
We agree to throw such meaningless terms out of future formulae. By (183) and
(247), the remaining elements (250) are mutually orthogonal and normalized.
The Fourier coefficients of the element $y$ with respect to
system (250) have the form

$$\frac{(y, \Delta_k\mathcal{E}_\lambda x)}{\sqrt{\Delta_k\rho(\lambda)}} = \frac{\Delta_k\varphi_y(\lambda)}{\sqrt{\Delta_k\rho(\lambda)}}.$$

The square of the norm of the difference between $y$ and its Fourier
series is given by the familiar formula [121]:

$$\Big\| y - \sum_{k=1}^{n} \frac{\Delta_k\varphi_y(\lambda)}{\Delta_k\rho(\lambda)}\,\Delta_k\mathcal{E}_\lambda x \Big\|^2 = \|y\|^2 - \sum_{k=1}^{n} \frac{|\Delta_k\varphi_y(\lambda)|^2}{\Delta_k\rho(\lambda)}, \tag{251}$$

which leads us to Bessel's inequality:

$$\sum_{k=1}^{n} \frac{|\Delta_k\varphi_y(\lambda)|^2}{\Delta_k\rho(\lambda)} \le \|y\|^2, \tag{252}$$

and in the limit:

$$\int_m^M \frac{|d\varphi_y(\lambda)|^2}{d\rho(\lambda)} \le \|y\|^2. \tag{253}$$

THEOREM 1. If $y \in C_x$, we have

$$\|y\|^2 = \int_m^M \frac{|d\varphi_y(\lambda)|^2}{d\rho(\lambda)}. \tag{254}$$


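If $y \in C_x$, in view of the fact that $C_x$ is the closed linear envelope
of $\mathcal{E}_\lambda x$, given any $\varepsilon > 0$, there exist a finite set of elements
$\mathcal{E}_{\lambda_s} x$ $(s = 1, 2, \dots, p)$ of $\mathcal{E}_\lambda x$ and numbers $c_s$ such that

$$y = \sum_{s=1}^{p} c_s\,\mathcal{E}_{\lambda_s} x + z \quad\text{and}\quad \|z\| \le \varepsilon. \tag{255}$$

We take the points $\lambda_s$ $(s = 1, 2, \dots, p)$ as the points of subdivision
of the interval $[m, M]$, adding the points $\lambda_0 = m$ and $\lambda_{p+1} = M$, if
these are not included among the $\lambda_s$, and we write $\Delta'_s$ for the
sub-intervals $[\lambda_{s-1}, \lambda_s]$ $(s = 1, 2, \dots, p + 1)$ thus obtained. We have
$\mathcal{E}_{\lambda_0} = 0$ and $\mathcal{E}_{\lambda_{p+1}} = E$, and on introducing the usual
notation $\Delta'_s\mathcal{E}_\lambda = \mathcal{E}_{\lambda_s} - \mathcal{E}_{\lambda_{s-1}}$:

$$\mathcal{E}_{\lambda_1} = \Delta'_1\mathcal{E}_\lambda; \quad \mathcal{E}_{\lambda_2} = \Delta'_1\mathcal{E}_\lambda + \Delta'_2\mathcal{E}_\lambda; \quad \mathcal{E}_{\lambda_3} = \Delta'_1\mathcal{E}_\lambda + \Delta'_2\mathcal{E}_\lambda + \Delta'_3\mathcal{E}_\lambda; \quad \dots$$

The linear combination of $\mathcal{E}_{\lambda_s} x$ appearing in (255) can thus be
written as a linear combination of $\Delta'_s\mathcal{E}_\lambda x$, and (255) can be
rewritten as

$$y = \sum_{s=1}^{p+1} d_s\,\Delta'_s\mathcal{E}_\lambda x + z \quad\text{and}\quad \|z\| \le \varepsilon,$$

where $d_s$ are new coefficients. In other words, we have

$$\Big\| y - \sum_{s=1}^{p+1} d_s\,\Delta'_s\mathcal{E}_\lambda x \Big\| \le \varepsilon. \tag{255$_1$}$$

This inequality is all the more preserved if the sum above is replaced
by the Fourier series of the element $y$ with respect to the orthonormal
system [121]:

$$\frac{\Delta'_s\mathcal{E}_\lambda x}{\sqrt{\Delta'_s\rho(\lambda)}} \quad (s = 1, 2, \dots, p + 1).$$

Now, by (251), inequality (255$_1$) becomes

$$\|y\|^2 - \sum_{s=1}^{p+1} \frac{|\Delta'_s\varphi_y(\lambda)|^2}{\Delta'_s\rho(\lambda)} \le \varepsilon^2,$$

i.e.

$$\sum_{s=1}^{p+1} \frac{|\Delta'_s\varphi_y(\lambda)|^2}{\Delta'_s\rho(\lambda)} \ge \|y\|^2 - \varepsilon^2.$$

On comparing this inequality with (253) and recalling that $\varepsilon$ is
arbitrary, we see that the strict upper bound of the sums appearing
in inequality (252) is equal to $\|y\|^2$, i.e. (254) holds. By using this
formula and the formula

$$(y, z) = \tfrac{1}{4}\big[\|y + z\|^2 - \|y - z\|^2 + i\,\|y + iz\|^2 - i\,\|y - iz\|^2\big],$$

a more general formula is obtained for $y$ and $z$ belonging to $C_x$:

$$(y, z) = \int_m^M \frac{d\varphi_y(\lambda)\,\overline{d\varphi_z(\lambda)}}{d\rho(\lambda)}, \tag{256}$$

where

$$\varphi_z(\lambda) = (z, \mathcal{E}_\lambda x). \tag{257}$$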


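In order to deduce similar formulae for a bilinear functional, we
prove the following theorem.

THEOREM 2. If $y \in C_x$, then $\mathcal{E}_\lambda y$ and $Ay$ also belong to $C_x$.

Since $y \in C_x$, either $y$ is a finite linear combination of elements
$\mathcal{E}_\lambda x$:

$$y = \sum_{s=1}^{p} c_s\,\mathcal{E}_{\lambda_s} x, \tag{258}$$

or $y$ is the limit of such linear combinations. In the former case:

$$\mathcal{E}_\lambda y = \sum_{s=1}^{p} c_s\,\mathcal{E}_\lambda \mathcal{E}_{\lambda_s} x.$$

But, by (180), $\mathcal{E}_\lambda\mathcal{E}_{\lambda_s} = \mathcal{E}_\lambda$ for $\lambda < \lambda_s$ and
$\mathcal{E}_\lambda\mathcal{E}_{\lambda_s} = \mathcal{E}_{\lambda_s}$ for $\lambda \ge \lambda_s$,
i.e. $\mathcal{E}_\lambda y$ is also a finite linear combination of elements of $C_x$, and
$\mathcal{E}_\lambda y \in C_x$. If $y$ is a limit of finite linear combinations of elements
of $C_x$:

$$y = \lim_{n \to \infty} \sum_{s=1}^{p_n} c_s^{(n)}\,\mathcal{E}_{\lambda_s^{(n)}} x,$$

then

$$\mathcal{E}_\lambda y = \lim_{n \to \infty} \sum_{s=1}^{p_n} c_s^{(n)}\,\mathcal{E}_\lambda \mathcal{E}_{\lambda_s^{(n)}} x,$$

i.e. $\mathcal{E}_\lambda y$ is also a limit of finite linear combinations of elements of
$C_x$, so that in this case also $\mathcal{E}_\lambda y \in C_x$. By (205), the element $Ay$
is a limit of finite linear combinations of $\mathcal{E}_\lambda y$. Any
$\mathcal{E}_\lambda y \in C_x$ by what has been proved, so that every finite linear
combination of $\mathcal{E}_\lambda y$ and the limit of such linear combinations also
belong to $C_x$, i.e. $Ay \in C_x$, and the theorem is proved.

We can therefore write (256) with $y$ replaced by $Ay$.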
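Now, $\varphi_y(\lambda)$ is replaced by the function

$$\varphi_{Ay}(\lambda) = (Ay, \mathcal{E}_\lambda x) = \int_m^M \mu\,d_\mu(\mathcal{E}_\mu y, \mathcal{E}_\lambda x) = \int_m^M \mu\,d_\mu(y, \mathcal{E}_\mu\mathcal{E}_\lambda x),$$

and we obtain, on taking (180) into account:

$$\varphi_{Ay}(\lambda) = \int_m^\lambda \mu\,d\varphi_y(\mu),$$

and (256) gives us

$$(Ay, z) = \int_m^M \frac{d\Big[\int_m^\lambda \mu\,d\varphi_y(\mu)\Big]\,\overline{d\varphi_z(\lambda)}}{d\rho(\lambda)},$$

or, on recalling the property of Hellinger integrals [83], we have the
formula

$$(Ay, z) = \int_m^M \lambda\,\frac{d\varphi_y(\lambda)\,\overline{d\varphi_z(\lambda)}}{d\rho(\lambda)}. \tag{259}$$

Further, by (254), the expression on the right-hand side of (251)
tends to zero as the sub-intervals $\Delta_k$ become indefinitely smaller,
so that

$$\sum_{k=1}^{n} \frac{\Delta_k\varphi_y(\lambda)}{\Delta_k\rho(\lambda)}\,\Delta_k\mathcal{E}_\lambda x \Rightarrow y.$$

The terms of this last sum are elements of $H$, and the limit of this
sum may naturally be written as a Hellinger integral, in the same way
as previously for ordinary sums:

$$y = \int_m^M \frac{d\varphi_y(\lambda)}{d\rho(\lambda)}\,d\mathcal{E}_\lambda x \quad (y \in C_x). \tag{260}$$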


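If we apply this formula to the element $Ay$ instead of $y$, we obtain
by similar arguments:

$$Ay = \int_m^M \lambda\,\frac{d\varphi_y(\lambda)}{d\rho(\lambda)}\,d\mathcal{E}_\lambda x, \tag{261}$$

or, as the limit of a sum:

$$Ay = \lim \sum_{k=1}^{n} \nu_k\,\frac{\Delta_k\varphi_y(\lambda)}{\Delta_k\rho(\lambda)}\,\Delta_k\mathcal{E}_\lambda x. \tag{262}$$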


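Notice that we can use similar sums and passage to the limit in
$H$ to obtain a general definition of the Hellinger integral for elements
of $H$. Now, instead of applying (256) and (260) to the element $Ay$,
let us apply these formulae to the element $\mathcal{E}_\mu y$, which also belongs
to $C_x$, $\mu$ being any fixed number of the interval $[m, M]$. We have

$$\varphi_{\mathcal{E}_\mu y}(\lambda) = (\mathcal{E}_\mu y, \mathcal{E}_\lambda x) = (y, \mathcal{E}_\mu\mathcal{E}_\lambda x) = \begin{cases} (y, \mathcal{E}_\lambda x) & \text{for } \lambda \le \mu, \\ (y, \mathcal{E}_\mu x) & \text{for } \lambda > \mu, \end{cases}$$

i.e.

$$\varphi_{\mathcal{E}_\mu y}(\lambda) = \begin{cases} \varphi_y(\lambda) & \text{for } \lambda \le \mu, \\ \varphi_y(\mu) & \text{for } \lambda > \mu, \end{cases}$$

and the above-mentioned formulae at once give us

$$(\mathcal{E}_\mu y, z) = \int_m^\mu \frac{d\varphi_y(\lambda)\,\overline{d\varphi_z(\lambda)}}{d\rho(\lambda)}, \tag{263}$$

$$\mathcal{E}_\mu y = \int_m^\mu \frac{d\varphi_y(\lambda)}{d\rho(\lambda)}\,d\mathcal{E}_\lambda x. \tag{264}$$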


Notice that (256) is equivalent to the generalized closure equation, whilst (259) and (261) are equivalent to (241) and (240) of the previous section. The formulae of the present section have been deduced on the assumption that $y$ and $z \in C_x$. A self-conjugate operator A is said to have a simple continuous spectrum if there exists an element $x$ of H such that $C_x$ is the same as H. If this is the case, and we take as $x$ the element just mentioned, our formulae hold for any elements $y$ and $z$ of H.


148. Invariant subspaces. Before investigating a non-simple spectrum and a mixed spectrum, i.e. the case when the eigenelements exist but do not form a closed system, we must introduce a new concept and prove certain facts.

DEFINITION. A subspace L is said to be invariant under the operator A when the following condition is satisfied: if $x \in L$, then also $Ax \in L$. Alternatively, L is said to reduce A.

The meaning of the definition is as follows. If L reduces A, A can be regarded as an operator separately defined in L, where L may be finite-dimensional or may be taken as a Hilbert space. In other words, an operator A, defined in the whole of H, induces an operator defined in L, which coincides with A for elements of L. The consideration of A on separate invariant subspaces simplifies investigation of it. Notice that, if A is a self-conjugate operator in H, it is obviously a self-conjugate operator in any invariant subspace for it. Our future investigations will only cover subspaces invariant under self-conjugate operators.

THEOREM 1. If the subspace L reduces the self-conjugate operator A, the complementary subspace $H \ominus L$ also reduces A. The necessary and sufficient condition for L to reduce the self-conjugate operator A is that the projector $P_L$ commutes with A, i.e.

\[ P_L A = A P_L. \tag{265} \]

If L reduces A, $Az \in L$ when $z \in L$. We have to show that, if $x \perp L$, then $Ax \perp L$. Let $z$ be any element of L. Then $Az \in L$, and we have

\[ (Ax, z) = (x, Az) = 0, \]

and our assertion is proved. Let us turn to the proof of condition (265). We write down the obvious equation

\[ Ax = A P_L x + A (E - P_L) x. \]

If L reduces A, then $A(P_L x) \in L$, and by what has just been proved, $A[(E - P_L) x] \in H \ominus L$, so that the first term on the right-hand side is the projection of $Ax$ into L, i.e. $P_L A x = A P_L x$ for any $x$, and the necessity of (265) is proved. Conversely, let (265) be satisfied, and $x \in L$. Now, $Ax = A(P_L x) = P_L(Ax)$, i.e. $Ax \in L$, and the proof is complete. By using Theorem 2 of [143], the following immediate corollary is obtained for our present theorem:

COROLLARY. The necessary and sufficient condition for the subspace L to reduce A is that it reduce $\mathcal{E}_\lambda$ for any $\lambda$.
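As a finite-dimensional illustration of Theorem 1, the following sketch may help. It is not part of the text's argument: the 3×3 symmetric matrix and the helper names `matmul` and `commutes` are assumptions chosen for the example. The plane spanned by the first two coordinate vectors is invariant under A, so its projector and the projector onto the complement commute with A, whereas the projector onto a non-invariant line does not.

```python
# Finite-dimensional sketch of Theorem 1: L reduces A  <=>  P_L A = A P_L.
# The matrix A and the projectors below are illustrative assumptions.

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutes(X, Y):
    """True when XY = YX (exact comparison; all entries are integers here)."""
    return matmul(X, Y) == matmul(Y, X)

# Real symmetric (self-conjugate) operator leaving span(e1, e2) invariant.
A = [[2, 1, 0],
     [1, 2, 0],
     [0, 0, 5]]

P_L = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]   # projector onto L = span(e1, e2)
P_C = [[0, 0, 0], [0, 0, 0], [0, 0, 1]]   # projector E - P_L onto the complement
P_N = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]   # projector onto span(e1), not invariant

print(commutes(A, P_L))   # True
print(commutes(A, P_C))   # True
print(commutes(A, P_N))   # False
```

The third check fails because $Ae_1 = 2e_1 + e_2$ leaves the line spanned by $e_1$.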

THEOREM 2. If the mutually orthogonal subspaces $L_k$ ($k = 1, 2, \ldots$) reduce A, their orthogonal sum

\[ L = L_1 \oplus L_2 \oplus L_3 \oplus \ldots \]

also reduces A.

By hypothesis, A commutes with all the $P_{L_k}$, and hence commutes with their sum:

\[ P_L = P_{L_1} + P_{L_2} + P_{L_3} + \ldots, \]

which proves the theorem. Theorem 2 also holds when A is not self-conjugate.

Some simple facts must be mentioned in connection with the concept of subspace invariant under a self-conjugate operator. If the projector $P_M$ commutes with the projector $P_L$, then L reduces $P_M$, and the operator $P_M$ induces into L the projector $P_{LM}$ in the subspace LM. Further, let L reduce A, and hence reduce its spectral function $\mathcal{E}_\lambda$. Let $A^{(1)}$ and $\mathcal{E}_\lambda^{(1)}$ denote the operators which are induced into L by the operators A and $\mathcal{E}_\lambda$. It may easily be shown that $\mathcal{E}_\lambda^{(1)}$ is a resolution of the identity for $A^{(1)}$. If $x \in L$ in (205), we can replace A by $A^{(1)}$ and $\mathcal{E}_\lambda$ by $\mathcal{E}_\lambda^{(1)}$, and $\mathcal{E}_\lambda^{(1)}$ is the spectral function of the operator $A^{(1)}$ defined in L. By (223), L also reduces the resolvent $R_\lambda$ of operator A, where $R_\lambda$ induces into L the resolvent of operator $A^{(1)}$. Let $A^{(1)}$, $A^{(2)}$, $\mathcal{E}_\lambda^{(1)}$ and $\mathcal{E}_\lambda^{(2)}$ be operators induced by the self-conjugate operators A and $\mathcal{E}_\lambda$ into the invariant subspace L and the complementary subspace $H \ominus L$, whilst $x = x_1 + x_2$ and $y = y_1 + y_2$ are decompositions of $x$ and $y$ onto L and $H \ominus L$. We have the obvious equations:

\[ Ax = A^{(1)} x_1 + A^{(2)} x_2; \qquad \mathcal{E}_\lambda x = \mathcal{E}_\lambda^{(1)} x_1 + \mathcal{E}_\lambda^{(2)} x_2; \]
\[ (Ax, y) = (A^{(1)} x_1, y_1) + (A^{(2)} x_2, y_2). \tag{266} \]

Similar formulae hold when H is decomposed into a finite or denumerable number of mutually orthogonal subspaces reducing A. The whole of space H, and the zero subspace, i.e. the subspace that only contains the zero element, are trivial invariant subspaces for any operator. If an operator has no other invariant subspaces, it is described as an irreducible operator. Every subspace of eigenelements corresponding to a given eigenvalue $\lambda_0$ of an operator A is a subspace invariant under A, and in this subspace the operator A reduces to multiplication by the number $\lambda_0$. If $x_0$ is any given eigenelement corresponding to the eigenvalue $\lambda_0$, the set of elements of the form $a x_0$, where $a$ is any complex number, is also a subspace that reduces A. If $L_k$ ($k = 1, 2, \ldots$) are all the subspaces of eigenelements of a self-conjugate operator A, their orthogonal sum $H'$ reduces A. Let $A'$ and $\mathcal{E}'_\lambda$ be the operators induced into $H'$ by the operators A and $\mathcal{E}_\lambda$. We have, by (234) and (236):

\[ A' x = \sum_k \lambda_k (\mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_k - 0}) x, \qquad \mathcal{E}'_\lambda x = \sum_k \mathcal{E}_\lambda (\mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_k - 0}) x = \sum_{\lambda_k \leq \lambda} (\mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_k - 0}) x, \tag{267} \]

i.e. $\mathcal{E}'_\lambda$ amounts to the sum of the jumps of the function $\mathcal{E}_\lambda$ at the points $\lambda_k$ satisfying $\lambda_k \leq \lambda$. The operator $\mathcal{E}''_\lambda$, induced by $\mathcal{E}_\lambda$ into the subspace $H''$ complementary to $H'$, is a projector into the subspace $H'' M_\lambda$, where $M_\lambda$ is the subspace corresponding to the projector $\mathcal{E}_\lambda$. Any element of $H''$ is orthogonal to all the $L_k$, i.e. $(\mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_k - 0}) x = 0$ if $x \in H''$, and, given any $x$ belonging to $H''$, we can write $\mathcal{E}''_\lambda$ as the difference

\[ \mathcal{E}''_\lambda = \mathcal{E}_\lambda - \sum_{\lambda_k \leq \lambda} (\mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_k - 0}), \]

so that $\mathcal{E}''_\lambda$ is continuous for all $\lambda$. Therefore, if A does not have a purely point spectrum, the subspace $H''$ contains non-zero elements, the spectral function is continuous in it, and the operator has no eigenvalues at all in $H''$. The eigenelements in $H'$ form a closed system, and the operator A has a purely point spectrum in $H'$.


149. The general case of a continuous spectrum. We have seen that, if an element $y$ belongs to the subspace $C_x$ formed in [147], $Ay \in C_x$ also, i.e. $C_x$ reduces A. If the operator A does not have a point spectrum, i.e. its spectral function $\mathcal{E}_\lambda$ is continuous for all values of $\lambda$, and its continuous spectrum is not simple, the space H is expressible, as we shall show, as an orthogonal sum of subspaces of type $C_x$. In each of these subspaces the operator induced by A will have a simple continuous spectrum, and the formulae of [147] hold for elements belonging to such a subspace. The corresponding formulae for any elements of H will be obtained with the aid of a decomposition of the element over the above-mentioned subspaces, and, by (266), the formulae are arrived at by adding the corresponding formulae for the individual subspaces. The expression for H as an orthogonal sum of subspaces of type $C_x$ is deduced on the assumption that space H is separable. Let us take any closed orthonormal system

\[ u_1, u_2, u_3, \ldots \]

We form $C_{y_1}$ by putting $y_1 = u_1$. The element $u_2$ can be written in the form $u_2 = v_2 + y_2$, where $v_2 \in C_{y_1}$ and $y_2 \perp C_{y_1}$. If $y_2 \neq 0$, $C_{y_2}$ is obtained. Let us show that $C_{y_2} \perp C_{y_1}$. By hypothesis, we have $(y_2, \mathcal{E}_\lambda y_1) = 0$ for any $\lambda$, since $\mathcal{E}_\lambda y_1 \in C_{y_1}$. We have further: $(\mathcal{E}_\mu y_2, \mathcal{E}_\lambda y_1) = (y_2, \mathcal{E}_\mu \mathcal{E}_\lambda y_1) = 0$. Hence it follows that any linear combination of elements $\mathcal{E}_\mu y_2$ is orthogonal to any linear combination of $\mathcal{E}_\lambda y_1$, and in the limit, any element of $C_{y_2}$ is orthogonal to any element of $C_{y_1}$. We next take the element $u_3$ and write it in the form $u_3 = v_3 + y_3$, where $v_3 \in C_{y_1} \oplus C_{y_2}$ and $y_3 \perp C_{y_1} \oplus C_{y_2}$, and form $C_{y_3}$. It may be shown as above that $C_{y_3} \perp C_{y_1}$ and $C_{y_3} \perp C_{y_2}$, and so on. We obtain in this way a finite or denumerable number of mutually orthogonal subspaces $C_{y_k}$, and since each element $x$ of H can be expanded in elements $u_k$ that form a closed system, the orthogonal sum of subspaces $C_{y_k}$ must give the whole of H:

\[ H = C_{y_1} \oplus C_{y_2} \oplus C_{y_3} \oplus \ldots \tag{268} \]


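The formulae of [147] hold in each of the subspaces $C_{y_k}$, and we can therefore write for any elements $y$ and $z$ of H:

\[ (y, z) = \sum_k \int_m^M \frac{d(y, \mathcal{E}_\lambda y_k) \, d(\mathcal{E}_\lambda y_k, z)}{d\varrho_k(\lambda)}, \tag{269} \]

\[ (\mathcal{E}_\mu y, z) = \sum_k \int_m^\mu \frac{d(y, \mathcal{E}_\lambda y_k) \, d(\mathcal{E}_\lambda y_k, z)}{d\varrho_k(\lambda)}, \tag{270} \]

\[ (Ay, z) = \sum_k \int_m^M \lambda \, \frac{d(y, \mathcal{E}_\lambda y_k) \, d(\mathcal{E}_\lambda y_k, z)}{d\varrho_k(\lambda)}, \tag{271} \]

\[ y = \sum_k \int_m^M \frac{d(y, \mathcal{E}_\lambda y_k)}{d\varrho_k(\lambda)} \, d\mathcal{E}_\lambda y_k; \qquad Ay = \sum_k \int_m^M \lambda \, \frac{d(y, \mathcal{E}_\lambda y_k)}{d\varrho_k(\lambda)} \, d\mathcal{E}_\lambda y_k, \tag{272} \]

\[ \mathcal{E}_\mu y = \sum_k \int_m^\mu \frac{d(y, \mathcal{E}_\lambda y_k)}{d\varrho_k(\lambda)} \, d\mathcal{E}_\lambda y_k, \tag{273} \]

where $\varrho_k(\lambda) = \| \mathcal{E}_\lambda y_k \|^2$, and the sums written may be either finite or infinite. In the latter case, the convergence of the series in (272) and (273), containing elements of H, has to be understood as a convergence of elements of H.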

The above method of forming the subspaces $C_{y_k}$ can be written in terms of the formulae that follow. If $v$ is any element of H, its projection $v_k$ onto $C_{y_k}$ is defined by the corresponding terms of formulae (272), i.e.

\[ v_k = \int_m^M \frac{d(v, \mathcal{E}_\mu y_k)}{d\varrho_k(\mu)} \, d\mathcal{E}_\mu y_k, \]

and we obtain for $\mathcal{E}_\lambda v_k$, by taking (180) into account as usual,

\[ \mathcal{E}_\lambda v_k = \int_m^\lambda \frac{d(v, \mathcal{E}_\mu y_k)}{d\varrho_k(\mu)} \, d\mathcal{E}_\mu y_k. \tag{274} \]

This remark leads directly to the following formulae, which correspond to the process described above for forming the subspaces $C_{y_k}$:

\[ \mathcal{E}_\lambda y_1 = \mathcal{E}_\lambda u_1; \qquad \mathcal{E}_\lambda y_n = \mathcal{E}_\lambda u_n - \sum_{s=1}^{n-1} \int_m^\lambda \frac{d(u_n, \mathcal{E}_\mu y_s)}{d\varrho_s(\mu)} \, d\mathcal{E}_\mu y_s. \tag{275} \]

Different subspaces $C_{y_k}$ are in general obtained with a different choice of initial system $u_k$. In fact the number of these subspaces may turn out to be different. It is useful to extend these subspaces as far as possible, right from the start.
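The dependence on the initial system can be seen in a small finite-dimensional sketch. It is only an analogue, under assumptions not made in the text: the operator is taken as the diagonal matrix with eigenvalues 1, 1, 2 (a point spectrum rather than a continuous one), so that $C_y$ is spanned by the components of $y$ in the separate eigenspaces, and the function names are hypothetical. The same procedure $u_k = v_k + y_k$ then yields three subspaces for one orthonormal system and two for another.

```python
import math

# Finite-dimensional sketch of the construction of the subspaces C_{y_k}.
# Assumption: A = diag(1, 1, 2), so E_lambda keeps the coordinates whose
# eigenvalue is <= lambda, and C_y is spanned by the components of y in
# the separate eigenspaces.
EIG = [1.0, 1.0, 2.0]
TOL = 1e-12

def norm(v):
    return math.sqrt(sum(t * t for t in v))

def cyclic_basis(y):
    """Orthonormal basis of C_y, the span of all E_lambda y."""
    basis = []
    for mu in sorted(set(EIG)):
        part = [t if e == mu else 0.0 for t, e in zip(y, EIG)]
        n = norm(part)
        if n > TOL:
            basis.append([t / n for t in part])
    return basis

def decompose(system):
    """Split each u_k as v_k + y_k and form C_{y_k} whenever y_k is non-zero."""
    subspaces = []
    for u in system:
        y = list(u)
        for basis in subspaces:
            for b in basis:
                c = sum(s * t for s, t in zip(u, b))   # coefficient of u along b
                y = [s - c * t for s, t in zip(y, b)]  # remove the part lying in earlier subspaces
        if norm(y) > TOL:
            subspaces.append(cyclic_basis(y))
    return subspaces

s = 1.0 / math.sqrt(2.0)
standard = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
rotated = [[s, 0.0, s], [0.0, 1.0, 0.0], [s, 0.0, -s]]

print([len(b) for b in decompose(standard)])   # [1, 1, 1]
print([len(b) for b in decompose(rotated)])    # [2, 1]
```

In the second run the first element already has components in both eigenspaces, so its subspace is two-dimensional and the third element contributes nothing new.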

It can be shown that a construction of $C_{y_k}$ is possible, such that the following condition is fulfilled: every set of measure zero with respect to $\varrho_p(\lambda)$ is a set of measure zero with respect to $\varrho_{p+1}(\lambda)$, $\varrho_{p+2}(\lambda)$, ... also. In view of the results of [74], this condition is equivalent to the following: every $\varrho_p(\lambda)$ is expressible in terms of the preceding $\varrho_k(\lambda)$ by

\[ \varrho_p(\lambda) = \int_m^\lambda \varphi_p^{(k)}(\mu) \, d\varrho_k(\mu) \qquad (k = 1, 2, \ldots, p - 1), \]

where the integral has to be understood as a Lebesgue-Stieltjes integral, and the $\varphi_p^{(k)}(\lambda)$ are non-negative functions, measurable with respect to $\varrho_k(\lambda)$ and summable. When this condition is fulfilled, we shall say that the subdivision of the spectral function is normal. It can be shown that the number of subspaces is the same in different normal subdivisions. We shall return later to this question.


150. The case of a mixed spectrum. As already mentioned, a self-conjugate operator is said to have a mixed spectrum if eigenelements of the operator exist, but the orthonormal system of eigenelements

\[ x_1, x_2, x_3, \ldots \tag{276} \]

is not closed in H. As above, let $L_k$ be the subspaces of eigenelements corresponding to the eigenvalue $\lambda_k$. As we saw in [148], in the invariant subspace

\[ H' = L_1 \oplus L_2 \oplus L_3 \oplus \ldots \]

the operator has a purely point spectrum with eigenvalues $\lambda_k$ and subspaces of eigenelements $L_k$. Elements (276) here form a closed system in $H'$, consisting of eigenelements of A.

The operator A has only a purely continuous spectrum in the complementary subspace $H''$. By making use of the formulae of [149] and [146], the following general formulae are obtained, in which the sums are due to the point spectrum in $H'$, and the integrals due to the continuous spectrum in $H''$:


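\[ (y, z) = \sum_k a_k \overline{b_k} + \sum_k \int_m^M \frac{d(y, \mathcal{E}_\lambda y_k) \, d(\mathcal{E}_\lambda y_k, z)}{d\varrho_k(\lambda)}, \tag{277} \]

\[ (Ay, z) = \sum_k \mu_k a_k \overline{b_k} + \sum_k \int_m^M \lambda \, \frac{d(y, \mathcal{E}_\lambda y_k) \, d(\mathcal{E}_\lambda y_k, z)}{d\varrho_k(\lambda)}, \tag{278} \]

\[ y = \sum_k a_k x_k + \sum_k \int_m^M \frac{d(y, \mathcal{E}_\lambda y_k)}{d\varrho_k(\lambda)} \, d\mathcal{E}_\lambda y_k, \tag{279} \]

\[ Ay = \sum_k \mu_k a_k x_k + \sum_k \int_m^M \lambda \, \frac{d(y, \mathcal{E}_\lambda y_k)}{d\varrho_k(\lambda)} \, d\mathcal{E}_\lambda y_k, \tag{280} \]

where $a_k$ and $b_k$ are the Fourier coefficients of elements $y$ and $z$ with respect to system (276) and the $\mu_k$ are the eigenvalues of A corresponding to the eigenelements $x_k$.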

By using the above subdivision of the operator into an operator with a purely point spectrum in $H'$ and one with a continuous spectrum in $H''$, we can make a classification of the points of the spectrum.

DEFINITION. We say that $\lambda_0$ belongs to the point spectrum if $\lambda_0$ is an eigenvalue of A. If $\lambda_0$ is a limit point of a point spectrum, i.e. there are eigenvalues different from $\lambda_0$ in any $\varepsilon$-neighbourhood of it, $\lambda_0$ is said to belong to the limiting spectrum. Finally, $\lambda_0$ is said to belong to the continuous spectrum if $\lambda_0$ belongs to the spectrum of the operator $A''$ induced into $H''$ by A, i.e. if the spectral function $\mathcal{E}''_\lambda$ of $A''$ is not constant in any interval containing $\lambda_0$ as an interior point.

Every point of the spectrum of the operator A belongs to at least one of these three categories, though it may happen that, say, $\lambda_0$ belongs to all three categories simultaneously. A further concept is sometimes brought in, viz. $\lambda_0$ is said to be a point of condensation of the spectrum if it is either an eigenvalue of infinite rank or an element of the limiting spectrum or an element of the continuous spectrum.


151. Differential solutions. Let us take an operator with a purely continuous spectrum. Given any $x$, the elements $\mathcal{E}_\lambda x$ satisfy an equation analogous to the equation $Ax = \lambda x$ in the case of a point spectrum. Let $\Delta[\lambda_1, \lambda_2]$ be any interval. We can use property (180) to write the following analogue of equation (213):

\[ \int_m^M \mu \, d\mathcal{E}_\mu \, \Delta \mathcal{E}_\lambda = \int_{\lambda_1}^{\lambda_2} \mu \, d\mathcal{E}_\mu. \]

We bring in the element $x(\lambda) = \mathcal{E}_\lambda x$, continuously dependent on the parameter $\lambda$ in the interval $[m, M]$ in the sense that $\| x(\lambda) - x(\lambda_0) \| \to 0$ as $\lambda \to \lambda_0$ for any $\lambda_0$ of $[m, M]$. It will be seen from the above equation that $x(\lambda)$ satisfies

\[ A[\Delta x(\lambda)] = \int_\Delta \lambda \, dx(\lambda). \tag{281} \]

In this case, $x(\lambda)$ is said to be a differential solution of the equation $Ax = \lambda x$. The vectors $y_k(\lambda) = \mathcal{E}_\lambda y_k$ formed in [149] are therefore differential solutions. Given different $k$, they lie in the orthogonal subspaces $C_{y_k}$, and we therefore have, for any intervals $\Delta_1$ and $\Delta_2$:

\[ (\Delta_1 y_p(\lambda), \Delta_2 y_q(\lambda)) = 0 \qquad (p \neq q). \tag{282} \]

If $\Delta_1$ and $\Delta_2$ have no common interior points, we have, by (180),

\[ (\Delta_1 y_p(\lambda), \Delta_2 y_p(\lambda)) = 0 \tag{283} \]

($\Delta_1$ and $\Delta_2$ have no common interior points). If $\Delta_1$ and $\Delta_2$ have a common part $\Delta_{12}$, by (180),

\[ (\Delta_1 y_p(\lambda), \Delta_2 y_p(\lambda)) = \| \Delta_{12} y_p(\lambda) \|^2. \tag{284} \]


We now bring in the concept of a complete system of differential solutions. Any system $y_p(\lambda)$ of differential solutions, which is orthogonal in the sense of (282), is said to be complete if every element $z$ that is orthogonal to all the $y_p(\lambda)$, i.e. satisfies

\[ (y_p(\lambda), z) = 0 \tag{285} \]

for any $p$ and any $\lambda$, is the zero element. It may easily be shown that the solutions $y_p(\lambda) = \mathcal{E}_\lambda y_p$ formed above are a complete system. For, we can conclude from (285) that the element $z$ is orthogonal to the subspace $C_{y_p}$, which is the closed linear envelope of $y_p(\lambda) = \mathcal{E}_\lambda y_p$, and this is true for any $p$. But the orthogonal sum of the $C_{y_p}$ is the whole of H, and $z$ is therefore orthogonal to the whole of H, i.e. is the zero element.

We started out from the spectral function $\mathcal{E}_\lambda$ when forming the solutions of equation (281). We shall now start out from the equation itself. Suppose we have somehow succeeded in forming the solution $x(\lambda)$ of (281). Since $x(\lambda)$ appears in this equation only under the difference and differential signs, we can obtain further solutions by subtracting any element independent of $\lambda$ from $x(\lambda)$. In particular, $x(\lambda) - x(m)$ will be a solution, and we can therefore always assume that $x(m) = 0$. We shall show below that any solution of (281), continuously dependent on $\lambda$ in the interval $[m, M]$ and satisfying the condition $x(m) = 0$, necessarily has the form $x(\lambda) = \mathcal{E}_\lambda x$. Suppose we have somehow succeeded in forming a finite or infinite number of solutions $y_p(\lambda)$ of (281), mutually orthogonal in the sense of (282). By what has been proved, each of them has the form $y_p(\lambda) = \mathcal{E}_\lambda y_p$, where $y_p$ is an element of H. The closed linear envelope of each $y_p(\lambda)$ is a subspace $C_{y_p}$, these subspaces being orthogonal, by (282). The completeness of solutions $y_p(\lambda)$ is equivalent to the fact that the orthogonal sum of the $C_{y_p}$ is the whole of H. If these solutions in fact form a complete system, we can write the formulae of [149] with $\mathcal{E}_\lambda y_p$ replaced by $y_p(\lambda)$. We can therefore obtain these formulae by starting out from any desired orthogonal complete system of continuous differential solutions. The completeness of a system of mutually orthogonal solutions, no matter how constructed, may be verified by (271) for a bilinear functional, or by (273), if the spectral function $\mathcal{E}_\lambda$ is known.

Notice also that equations other than (281) can be formed for $x(\lambda) = \mathcal{E}_\lambda x$. Supposing that the interval $\Delta$ does not contain $\lambda = 0$, we can write, by (180),


M 
Ae on AE, 


m 


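and hence obtain the equation for $x(\lambda)$:

\[ A \int_\Delta \frac{1}{\lambda} \, dx(\lambda) = \Delta x(\lambda) \]

or

\[ \int_\Delta \frac{1}{\lambda} \, d[A x(\lambda)] = \Delta x(\lambda). \]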


Let us now turn to the proofs of the above assertions.

THEOREM. Every solution of equation (281), continuous in the interval $[m, M]$ and equal to the zero element when $\lambda = m$, has the form $x(\lambda) = \mathcal{E}_\lambda x(M)$.

Given that $x(m) = 0$, equation (281) gives

\[ \int_m^\lambda \mu \, dx(\mu) = A x(\lambda), \]

$\mu$ being the variable of integration, both here and below. Having fixed two numbers $\mu_1 < \mu_2$ in any manner, we obtain

\[ \int_m^\lambda \mu \, d(x(\mu), x(\mu_2) - x(\mu_1)) = (A x(\lambda), x(\mu_2) - x(\mu_1)). \]

Using (281) and the fact that A is self-conjugate, we can write

\[ (A x(\lambda), x(\mu_2) - x(\mu_1)) = (x(\lambda), A x(\mu_2) - A x(\mu_1)) = \int_{\mu_1}^{\mu_2} \mu \, d(x(\lambda), x(\mu)), \]

and hence we arrive at the equation

\[ \int_m^\lambda \mu \, d(x(\mu), x(\mu_2) - x(\mu_1)) = \int_{\mu_1}^{\mu_2} \mu \, d(x(\lambda), x(\mu)). \]


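The integral on the right can be integrated by parts, and the mean value theorem applied to the resulting integral:

\[ \int_m^\lambda \mu \, d(x(\mu), x(\mu_2) - x(\mu_1)) = \mu_2 (x(\lambda), x(\mu_2)) - \mu_1 (x(\lambda), x(\mu_1)) - (\mu_2 - \mu_1)(x(\lambda), x(\mu_3)) \]

($\mu_3$ belongs to $[\mu_1, \mu_2]$), or

\[ \int_m^\lambda \mu \, d(x(\mu), x(\mu_2) - x(\mu_1)) = (\mu_2 - \mu_1)(x(\lambda), x(\mu_2)) + \mu_1 (x(\lambda), x(\mu_2) - x(\mu_1)) - (\mu_2 - \mu_1)(x(\lambda), x(\mu_3)), \]

and we can rewrite this last as

\[ \int_m^\lambda (\mu - \mu_1) \, d\, \frac{(x(\mu), x(\mu_2) - x(\mu_1))}{\mu_2 - \mu_1} = (x(\lambda), x(\mu_2) - x(\mu_3)). \]

On introducing the continuous functions

\[ \omega(\mu) = \frac{(x(\mu), x(\mu_2) - x(\mu_1))}{\mu_2 - \mu_1}, \qquad f(\lambda) = (x(\lambda), x(\mu_2) - x(\mu_3)), \tag{286} \]

this can in turn be rewritten as

\[ \int_m^\lambda (\mu - \mu_1) \, d\omega(\mu) = f(\lambda), \]

where we assume $\lambda < \mu_1 < \mu_2$, and obviously, $\omega(m) = 0$. This last equation is readily solved for $\omega(\lambda)$. All we need do is integrate the left-hand side by parts, which gives $\omega(\lambda) = f(\lambda)/(\lambda - \mu_1) + u(\lambda)$, where $u(\lambda)$ is a new required function, equal to zero when $\lambda = m$:

\[ \omega(\lambda) = \frac{f(\lambda)}{\lambda - \mu_1} + \int_m^\lambda \frac{f(\mu)}{(\mu - \mu_1)^2} \, d\mu, \]
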
or, on recalling the notation of (286),

\[ \frac{(x(\lambda), x(\mu_2) - x(\mu_1))}{\mu_2 - \mu_1} = \frac{(x(\lambda), x(\mu_2) - x(\mu_3))}{\lambda - \mu_1} + \int_m^\lambda \frac{(x(\mu), x(\mu_2) - x(\mu_3))}{(\mu - \mu_1)^2} \, d\mu. \]

If $\mu_1 \to \mu_2$, then $\mu_3 \to \mu_2$, and the first term on the right-hand side tends to zero. The same can be said of the integral, since $|(x(\mu), x(\mu_2) - x(\mu_3))| \leq C \| x(\mu_2) - x(\mu_3) \|$, where C is the greatest value of $\| x(\mu) \|$ in the interval $[m, M]$. We have taken the case when $\mu_1 \to \mu_2$ from smaller values. A similar approach can be used when $\lambda < \mu_2 < \mu_1$ and $\mu_1 \to \mu_2$. It therefore follows from the last formula that

\[ \frac{d}{d\mu} (x(\lambda), x(\mu)) = 0 \quad \text{for } \mu > \lambda, \]

and hence

\[ (x(\lambda), x(\mu)) = (x(\lambda), x(\lambda)) = \| x(\lambda) \|^2 \quad \text{for } \mu \geq \lambda. \]

On applying this formula with $\mu = M$ to the solution $y(\lambda) = x(\lambda) - \mathcal{E}_\lambda x(M)$ of equation (281), which vanishes when $\lambda = m$ and $\lambda = M$, we get $\| y(\lambda) \|^2 = 0$, i.e. $x(\lambda) = \mathcal{E}_\lambda x(M)$, and the theorem is proved.


152. The operation of multiplication by the independent variable. Let us return to the results of [147] and consider the function space $L_\varrho^2$ of the functions $f(\lambda)$, square integrable with respect to the function $\varrho(\lambda)$ defined by (245). The class $L_\varrho^2$ is the class of functions $f(\lambda)$, defined in the interval $[m, M]$, measurable with respect to $\varrho(\lambda)$, and such that

\[ \int_m^M |f(\lambda)|^2 \, d\varrho(\lambda) < +\infty. \tag{287} \]

The space $L_\varrho^2$ is a concrete form of space H. In this space, the operator of multiplication by the independent variable:

\[ A_0[f(\lambda)] = \lambda f(\lambda) \tag{288} \]

is obviously bounded and self-conjugate, since

\[ \| \lambda f(\lambda) \| \leq n \, \| f(\lambda) \|, \]

where $n$ is the greater of the numbers $|m|$ and $|M|$, and, since $\lambda$ is real,

\[ (\lambda f(\lambda), g(\lambda)) = (f(\lambda), \lambda g(\lambda)) = \int_m^M \lambda f(\lambda) \overline{g(\lambda)} \, d\varrho(\lambda). \]
We shall now establish the connection between space $C_x$ and the function space $L_\varrho^2$. In view of the existence of the Hellinger integral (249), any element $y$ of $C_x$ corresponds to a function $y(\lambda)$ of $L_\varrho^2$ such that [82]:

\[ \varphi_y(\lambda) = (y, \mathcal{E}_\lambda x) = \int_m^\lambda y(\mu) \, d\varrho(\mu). \tag{289} \]

Distinct elements $y$ and $z$ of $C_x$ correspond to distinct elements $y(\lambda)$ and $z(\lambda)$ of $L_\varrho^2$. For, if $y$ and $z$ were to correspond to equivalent functions $y(\lambda)$ and $z(\lambda)$, we should have, by (289): $(y - z, \mathcal{E}_\lambda x) = 0$ for any $\lambda$. The difference $y - z$ would thus be orthogonal to all the linear combinations of the $\mathcal{E}_\lambda x$, and on passing to the limit, $y - z$ would have to be orthogonal to the whole of subspace $C_x$. But $y - z \in C_x$, and we should get $(y - z, y - z) = \| y - z \|^2 = 0$, i.e. $y = z$. Conversely, in the case of two non-equivalent functions of $L_\varrho^2$, the integral in (289) cannot have the same value for all $\lambda$ [52]. Thus (289) establishes a one-to-one correspondence between elements $y$ of $C_x$ and elements of some lineal $\mathfrak{M}$ of $L_\varrho^2$. We shall show first that $\mathfrak{M}$ is a closed lineal. By using Lebesgue-Stieltjes integrals, we can rewrite (256) as [82]

\[ (y, z) = \int_m^M y(\lambda) \overline{z(\lambda)} \, d\varrho(\lambda), \tag{290} \]

where $z(\lambda)$ is the element of $L_\varrho^2$ corresponding to the element $z$ of $C_x$, i.e.

\[ (z, \mathcal{E}_\lambda x) = \int_m^\lambda z(\mu) \, d\varrho(\mu). \tag{291} \]


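Let $y^{(n)}(\lambda)$ be a sequence of elements of $\mathfrak{M}$ and $y^{(n)}$ the corresponding elements of $C_x$. On putting $y = z = y^{(n)} - y^{(m)}$ in (290), we get

\[ \| y^{(n)} - y^{(m)} \|^2 = \int_m^M | y^{(n)}(\lambda) - y^{(m)}(\lambda) |^2 \, d\varrho(\lambda). \tag{292} \]

If the $y^{(n)}(\lambda)$ tend in the mean to some element $y_0(\lambda)$ of $L_\varrho^2$, the right-hand side of (292) tends to zero as $n$ and $m \to +\infty$, so that the sequence of elements $y^{(n)}$ is mutually convergent, and there exists an element $u$ such that $y^{(n)} \Rightarrow u$, where $u \in C_x$, since $C_x$ is a subspace. Let $u(\lambda)$ be the element of $\mathfrak{M}$ corresponding to the element $u$ in accordance with (289). We shall show that the element $u(\lambda)$ is equivalent to $y_0(\lambda)$. It will follow from this that $y_0(\lambda) \in \mathfrak{M}$, i.e. that $\mathfrak{M}$ is a closed lineal. Putting $y = z = u - y^{(n)}$ in (290), we get

\[ \| u - y^{(n)} \|^2 = \int_m^M | u(\lambda) - y^{(n)}(\lambda) |^2 \, d\varrho(\lambda), \]

from which it is clear that the $y^{(n)}(\lambda)$ tend in the mean to $u(\lambda)$, so that the function $u(\lambda)$ is equivalent to $y_0(\lambda)$, since the limit in the mean is unique. We now show that the closed lineal $\mathfrak{M}$ coincides with $L_\varrho^2$. If this were not the case, an element $f_0(\lambda)$ of $L_\varrho^2$ would exist, not equivalent to the zero element, and orthogonal to all the elements of $\mathfrak{M}$. Since $\mathcal{E}_\lambda \mathcal{E}_\nu = \mathcal{E}_\nu$ for $\nu < \lambda$ and $\mathcal{E}_\lambda \mathcal{E}_\nu = \mathcal{E}_\lambda$ for $\nu > \lambda$, (289) takes the form for the element $y = \mathcal{E}_\nu x$:

\[ (\mathcal{E}_\nu x, \mathcal{E}_\lambda x) = (x, \mathcal{E}_\nu \mathcal{E}_\lambda x) = \| \mathcal{E}_\nu \mathcal{E}_\lambda x \|^2 = \int_m^\nu d\varrho(\mu) \quad \text{for } \lambda > \nu, \]
\[ (\mathcal{E}_\nu x, \mathcal{E}_\lambda x) = \int_m^\lambda d\varrho(\mu) \quad \text{for } \lambda < \nu, \]

i.e. the function $f(\lambda)$ of $\mathfrak{M}$ that corresponds to $\mathcal{E}_\nu x$ is equivalent to the function given by

\[ f(\lambda) = 1 \text{ for } \lambda < \nu \quad \text{and} \quad f(\lambda) = 0 \text{ for } \lambda > \nu. \]

The value of $f(\lambda)$ for $\lambda = \nu$ is not important, in view of the continuity of $\varrho(\lambda)$. The orthogonality of $f_0(\lambda)$ to the function just defined gives us, for any $\nu$:

\[ \int_m^\nu f_0(\lambda) \, d\varrho(\lambda) = 0, \]

whence it follows, as we know from [52], that $f_0(\lambda)$ is equivalent to zero with respect to $\varrho(\lambda)$, and the subspace $\mathfrak{M}$ must therefore coincide with $L_\varrho^2$. The above discussion gives the following theorem: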

THEOREM 1. Formula (289) establishes a one-to-one correspondence between elements $y$ of $C_x$ and elements $y(\lambda)$ of $L_\varrho^2$.

In view of (290), the value of the scalar product is preserved in this correspondence, and hence the norms of corresponding elements are the same. Moreover, the correspondence is obviously distributive, since the scalar product $(y, \mathcal{E}_\lambda x)$ is distributive with respect to $y$, and the integral in (289) is distributive. Thus, with this correspondence, the function space $L_\varrho^2$ is a concrete realization of the Hilbert space $C_x$. An operator A whose spectral function is $\mathcal{E}_\lambda$ is defined in this space. Let us prove the following theorem:

THEOREM 2. The replacement of $y$ by $Ay$ in $C_x$ corresponds to multiplication by $\mu$ in $L_\varrho^2$, i.e. the operator A in $C_x$ corresponds to the operator (288) of multiplication by the independent variable in $L_\varrho^2$.

By using (206) and (289), and a property [75] of the Lebesgue-Stieltjes integral, we can write, on the assumption that $y \in C_x$:

\[ (Ay, \mathcal{E}_\lambda x) = \int_m^M \mu \, d_\mu (\mathcal{E}_\mu y, \mathcal{E}_\lambda x) = \int_m^\lambda \mu \, d_\mu (y, \mathcal{E}_\mu x), \]

i.e.

\[ (Ay, \mathcal{E}_\lambda x) = \int_m^\lambda \mu \, y(\mu) \, d\varrho(\mu), \]

whence it follows, on comparing with (289), that replacement of $y$ by $Ay$ corresponds to multiplication of $y(\mu)$ by $\mu$. Notice also that the general formula (259) for a bilinear functional $(Ay, z)$, where $y$ and $z \in C_x$, can be written, using (290) and the theorem proved, with the aid of the Lebesgue-Stieltjes integral as

\[ (Ay, z) = \int_m^M \lambda \, y(\lambda) \overline{z(\lambda)} \, d\varrho(\lambda). \tag{293} \]
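The correspondence of Theorems 1 and 2 can be checked numerically in a discrete analogue. The sketch below is only an illustration under assumptions that are not in the text: A is a diagonal 3×3 matrix with distinct eigenvalues, the generating element $x$ has no zero component, and $\varrho(\lambda)$ is therefore a step function with jump $|x_k|^2$ at the k-th eigenvalue. In that setting the analogue of (289) assigns to $y$ the values $y_k / x_k$; all function names are hypothetical.

```python
# Discrete sketch of Theorems 1 and 2 of this section.
# Assumptions: A = diag(lams) with distinct eigenvalues, and a generating
# element x with no zero component, so that rho(lambda) is a step function
# with jump |x_k|^2 at lams[k].
lams = [-1.0, 0.5, 2.0]
x = [1 + 0j, 2j, 1 - 1j]
rho_jumps = [abs(t) ** 2 for t in x]

def to_function(y):
    """Values y(lams[k]) of the function assigned to y by the analogue of (289)."""
    return [yk / xk for yk, xk in zip(y, x)]

def inner(y, z):
    """Scalar product of the coordinate space, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(y, z))

def inner_rho(f, g):
    """Scalar product of the weighted function space, cf. (290)."""
    return sum(a * b.conjugate() * r for a, b, r in zip(f, g, rho_jumps))

def apply_A(y):
    return [l * yk for l, yk in zip(lams, y)]

y = [1 + 1j, 2 + 0j, -1j]
z = [0.5 + 0j, 1j, 3 + 0j]
fy, fz = to_function(y), to_function(z)

# Theorem 1: the scalar product is preserved.
gap_product = abs(inner(y, z) - inner_rho(fy, fz))
# Theorem 2: Ay corresponds to the function lambda * y(lambda).
gap_mult = max(abs(a - l * b)
               for a, l, b in zip(to_function(apply_A(y)), lams, fy))

print(gap_product < 1e-12, gap_mult < 1e-12)   # True True
```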


153. The unitary equivalence of self-conjugate operators. Let $\mathcal{E}_\lambda$ be a resolution of the identity for the self-conjugate operator A, U be a unitary operator and $B = UAU^{-1}$. As may easily be seen, the operator $\mathcal{F}_\lambda = U \mathcal{E}_\lambda U^{-1}$ is also a resolution of the identity. Let $B'$ be the corresponding self-conjugate operator, so that $B'x$ is defined as the limit of a sum:

\[ \sum_{k=1}^n \nu_k \, \Delta_k (U \mathcal{E}_\lambda U^{-1}) x = U \left( \sum_{k=1}^n \nu_k \, \Delta_k \mathcal{E}_\lambda \right) (U^{-1} x), \]

whence it is clear that $B'$ coincides with B, i.e. $\mathcal{F}_\lambda = U \mathcal{E}_\lambda U^{-1}$ is the spectral function of the operator B. It follows from this that self-conjugate operators that are unitarily equivalent must have the same spectrum.
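A two-dimensional sketch of this statement follows. The rotation U, the diagonal matrix A and the helper names are assumptions made for the illustration only. With P the jump of the spectral function of A at the eigenvalue 1, the operator $UPU^{-1}$ is a projector on which $B = UAU^{-1}$ acts as multiplication by the same number 1.

```python
# Sketch: B = U A U^{-1} with a real rotation U, so that U^{-1} is the transpose.
# All numerical values are illustrative assumptions.

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(X):
    return [list(row) for row in zip(*X)]

def close(X, Y, tol=1e-12):
    """Entrywise comparison of two matrices up to rounding error."""
    return all(abs(a - b) < tol for r, s in zip(X, Y) for a, b in zip(r, s))

c, s = 0.6, 0.8
U = [[c, -s], [s, c]]
Uinv = transpose(U)

A = [[1.0, 0.0], [0.0, 3.0]]   # eigenvalues 1 and 3
P = [[1.0, 0.0], [0.0, 0.0]]   # jump of the spectral function of A at 1

B = matmul(matmul(U, A), Uinv)
F = matmul(matmul(U, P), Uinv)  # corresponding jump for B

print(close(matmul(B, F), F))                          # True: B F = 1 * F
print(close(matmul(F, F), F), close(F, transpose(F)))  # True True: F is a projector
print(abs(B[0][0] + B[1][1] - 4.0) < 1e-12)            # True: the trace 1 + 3 is unchanged
```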

In the case of a purely point spectrum, coincidence of the eigenvalues and their ranks is sufficient as well as necessary for unitary equivalence. The unitary operator U is easily formed as the operator transforming the subspace of eigenelements of B into the subspace of eigenelements of A corresponding to the same eigenvalue. The question of the conditions for unitary equivalence becomes much more complicated in the case of a continuous spectrum. We shall quote without proof the fundamental result relating to this case (the space is assumed separable).

The necessary and sufficient conditions for unitary equivalence of two self-conjugate operators $A^{(1)}$ and $A^{(2)}$ are as follows: (1) the spectra of the operators are of the same type (purely point, purely continuous, or mixed); (2) in the case of a point spectrum, it consists, for both operators, of the same eigenvalues with the same rank; (3) in the case of a continuous spectrum, the number of invariant subspaces in the normal subdivision of the continuous part of the spectral function is the same for the operators, and if the functions

\[ \| \mathcal{E}_\lambda^{(1)} y_k^{(1)} \|^2 = \varrho_k^{(1)}(\lambda) \quad \text{and} \quad \| \mathcal{E}_\lambda^{(2)} y_k^{(2)} \|^2 = \varrho_k^{(2)}(\lambda), \]

which we formed in [149], are brought in for the normal forms of the continuous spectra of $A^{(1)}$ and $A^{(2)}$, the set of measure zero with respect to $\varrho_k^{(1)}(\lambda)$ must be a set of measure zero with respect to $\varrho_k^{(2)}(\lambda)$ and vice versa, i.e. for any $k$,

\[ \varrho_k^{(1)}(\lambda) = \int_m^\lambda \varphi_k^{(1)}(\mu) \, d\varrho_k^{(2)}(\mu) \quad \text{and} \quad \varrho_k^{(2)}(\lambda) = \int_m^\lambda \varphi_k^{(2)}(\mu) \, d\varrho_k^{(1)}(\mu), \]

where $\varphi_k^{(1)}(\lambda)$ is measurable with respect to $\varrho_k^{(2)}(\lambda)$, non-negative and summable, and similarly for $\varphi_k^{(2)}(\lambda)$.


154. The spectral resolution of unitary operators. Let A be a self-conjugate operator and $\mathcal{E}_\lambda$ its spectral function, where

\[ \mathcal{E}_\lambda = 0 \text{ for } \lambda = 0 \quad \text{and} \quad \mathcal{E}_\lambda = E \text{ for } \lambda = 1. \tag{294} \]

We form the operator U in accordance with the formula

\[ U = \int_0^1 e^{2\pi i \lambda} \, d\mathcal{E}_\lambda = e^{2\pi i A}. \tag{295} \]

The conjugate operator can be formed simply by replacing the function $e^{2\pi i \lambda}$ by its conjugate $e^{-2\pi i \lambda}$ [143], i.e.

\[ U^* = \int_0^1 e^{-2\pi i \lambda} \, d\mathcal{E}_\lambda, \]

and, in view of (214), we arrive at the equations $UU^* = U^*U = E$, i.e. the U given by (295), with condition (294), is a unitary operator. The converse will be stated without proof.

THEOREM. If we take all the possible resolutions of the identity, satisfying conditions (294), formula (295) represents the general form of a unitary operator, where distinct unitary operators U correspond to distinct resolutions of the identity $\mathcal{E}_\lambda$.
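Formula (295) can be tried out in two dimensions, where the integral reduces to a finite sum over the jumps of the spectral function. The sketch below rests on assumptions made only for the illustration: the eigenvalues 0.25 and 0.5 (both inside the interval of condition (294)), the unit vector (0.6, 0.8), and the helper names.

```python
import cmath

# Sketch of (295) in two dimensions: U = sum_k exp(2*pi*i*lam_k) P_k,
# where P_k are the jumps of the spectral function. The eigenvalues and
# the unit vector below are illustrative assumptions.

def outer(v):
    """Projector onto the line spanned by the real unit vector v."""
    return [[a * b for b in v] for a in v]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint(X):
    n = len(X)
    return [[X[j][i].conjugate() for j in range(n)] for i in range(n)]

lams = [0.25, 0.5]
projectors = [outer([0.6, 0.8]), outer([-0.8, 0.6])]

U = [[sum(cmath.exp(2j * cmath.pi * l) * P[i][j]
          for l, P in zip(lams, projectors))
      for j in range(2)] for i in range(2)]

UUstar = matmul(U, adjoint(U))
UstarU = matmul(adjoint(U), U)

def is_identity(X, tol=1e-12):
    return all(abs(X[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(2) for j in range(2))

print(is_identity(UUstar), is_identity(UstarU))   # True True
```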

In [143] we defined the function $f(A)$ of a self-conjugate operator A, corresponding to a continuous function $f(\lambda)$. We shall generalize this definition in the next section to a wider class of functions $f(\lambda)$.


155. Functions of a self-conjugate operator. Let A be a self-conjugate operator and $\mathcal{E}_\lambda$ its spectral function. If $f(\lambda)$ is continuous in the interval $[m, M]$, the operator $f(A)$ has been defined by

\[ f(A) = \int_m^M f(\lambda) \, d\mathcal{E}_\lambda \]

or by the equivalent formula for a bilinear functional:

\[ (f(A) x, y) = \int_m^M f(\lambda) \, d(\mathcal{E}_\lambda x, y), \tag{296} \]

where $(\mathcal{E}_\lambda x, y)$ is a complex function of bounded variation of $\lambda$. As we know [125], it is a linear combination of four non-decreasing functions of the form $\| \mathcal{E}_\lambda z \|^2$, where $z$ is an element of H. Thus, if $f(\lambda)$ is any bounded function, measurable with respect to the non-decreasing functions

\[ \| \mathcal{E}_\lambda z \|^2 \tag{297} \]

for any choice of $z$, integral (296) exists for any $x$ and $y$, and a bilinear functional $(f(A) x, y)$ is thereby defined. This functional is clearly distributive, since $(\mathcal{E}_\lambda x, y)$ is distributive. Let us show that the functional is bounded; we shall assume that $f(\lambda)$ is real, though this is not actually essential. Since $f(\lambda)$ is bounded, $|f(\lambda)| \leq C$, where C is a positive number. On putting $y = x$, we arrive at the expression for a quadratic functional:

\[ (f(A) x, x) = \int_m^M f(\lambda) \, d \| \mathcal{E}_\lambda x \|^2. \tag{298} \]

Remember that, if $\mathcal{E}_m$ is non-zero, the last integral is equivalent to the sum

\[ f(m) \| \mathcal{E}_m x \|^2 + \int_m^M f(\lambda) \, d \| \mathcal{E}_\lambda x \|^2, \]

where the integral is understood as the ordinary limit of a sum. Since $|f(\lambda)| \leq C$ and $\| \mathcal{E}_M x \| = \| x \|$, we have for integral (298):

\[ |(f(A) x, x)| \leq C \| x \|^2. \tag{298$_1$} \]
We have further, on using $\Re$ to denote the real part:

\[ 4 \Re \int_m^M f(\lambda) \, d(\mathcal{E}_\lambda x, y) = \int_m^M f(\lambda) \, d(\mathcal{E}_\lambda (x + y), x + y) - \int_m^M f(\lambda) \, d(\mathcal{E}_\lambda (x - y), x - y), \]

or

\[ 4 \Re \int_m^M f(\lambda) \, d(\mathcal{E}_\lambda x, y) = \int_m^M f(\lambda) \, d \| \mathcal{E}_\lambda (x + y) \|^2 - \int_m^M f(\lambda) \, d \| \mathcal{E}_\lambda (x - y) \|^2, \]

and, by (298$_1$),

\[ 4 \left| \Re \int_m^M f(\lambda) \, d(\mathcal{E}_\lambda x, y) \right| \leq C ( \| x + y \|^2 + \| x - y \|^2 ) = 2C [ \| x \|^2 + \| y \|^2 ]. \]

On arguing as in [122], the same expression as above can be obtained at once, but without the sign of the real part:

\[ 4 \left| \int_m^M f(\lambda) \, d(\mathcal{E}_\lambda x, y) \right| \leq 2C [ \| x \|^2 + \| y \|^2 ]. \]

When $\| x \| = \| y \| = 1$, we obtain

\[ |(f(A) x, y)| = \left| \int_m^M f(\lambda) \, d(\mathcal{E}_\lambda x, y) \right| \leq C. \tag{299} \]

If $x$ and $y$ have an arbitrary norm, we can write

\[ (f(A) x, y) = \| x \| \, \| y \| \left( f(A) \frac{x}{\| x \|}, \frac{y}{\| y \|} \right), \]

where the norm of the elements $x / \| x \|$ and $y / \| y \|$ is unity, and, in view of (299),

\[ |(f(A) x, y)| \leq C \| x \| \, \| y \|, \]

whence it follows that the bilinear functional $(f(A) x, y)$ is bounded.

We thus arrive at the following fundamental result: if $f(\lambda)$ is a bounded function, measurable with respect to non-decreasing functions (297) with any choice of $z$, then (296) defines a linear operator $f(A)$. Some properties of the operators $f(A)$ must be mentioned. If $f(\lambda)$ is real, it follows at once from (296) with $y = x$ that $f(A)$ is a self-conjugate operator. If $f(\lambda)$ is complex, the conjugate operator $f(A)^*$ is obtained from (296) with $f(\lambda)$ replaced by the conjugate function. If $f(\lambda) \geq 0$, it follows from (298) that $f(A)$ is a positive operator. Let us take the function $f_\mu(\lambda)$, defined as follows:

\[ f_\mu(\lambda) = 1 \text{ for } \lambda \leq \mu \quad \text{and} \quad f_\mu(\lambda) = 0 \text{ for } \lambda > \mu. \tag{300} \]


This is obviously a B-function, and we can form integral (296). On subdividing the domain of integration into $[m - 0, \mu]$ and $[\mu, M]$, we get

\[ (f_\mu(A) x, y) = \int_m^\mu d(\mathcal{E}_\lambda x, y) = (\mathcal{E}_\mu x, y), \]

whence it follows that

\[ \mathcal{E}_\mu = f_\mu(A). \tag{301} \]

This formula is obtained on the assumption that every self-conjugate operator A has a spectral function $\mathcal{E}_\lambda$, in terms of which it can be expressed with the aid of (204), i.e. when deducing (301) we have taken as our basis the theorem of [142], which is not yet proved. The proof will amount in essence to defining for any self-conjugate operator A the function $f_\mu(A)$, without making use of the spectral function $\mathcal{E}_\lambda$, after which we prove the fundamental formula (204) by putting $\mathcal{E}_\mu = f_\mu(A)$.

The familiar properties of the Lebesgue-Stieltjes integral can be 
used to obtain quite easily the properties of a function of a self- 
conjugate operator A. It will naturally be assumed here that all the 
functions /(A) discussed belong to the class defined above, i.e. are 
bounded and measurable with respect to functions (210) with any 
choice of 2. 

THEOREM 1. The operator corresponding to a linear combination of functions $a_1 f_1(\lambda) + a_2 f_2(\lambda) + \dots + a_p f_p(\lambda)$ is $a_1 f_1(A) + a_2 f_2(A) + \dots + a_p f_p(A)$.

2. $f(A)$ commutes with $\mathcal{E}_\mu$ and $A$.

3. We have

$$(f_1(A)x, f_2(A)y) = \int_{m-\varepsilon_0}^{M} f_1(\lambda)\,\overline{f_2(\lambda)}\, d(\mathcal{E}_\lambda x, y). \tag{302}$$

4. The operator corresponding to the function $f_1(\lambda) f_2(\lambda)$ is

$$f_1(A) f_2(A) = f_2(A) f_1(A). \tag{303}$$
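Let us show e.g. that $f(A)$ commutes with $\mathcal{E}_\mu$:

$$(f(A)\mathcal{E}_\mu x, y) = \int_{m-\varepsilon_0}^{M} f(\lambda)\, d(\mathcal{E}_\lambda \mathcal{E}_\mu x, y) = \int_{m-\varepsilon_0}^{M} f(\lambda)\, d(\mathcal{E}_\lambda x, \mathcal{E}_\mu y) = (f(A)x, \mathcal{E}_\mu y) = (\mathcal{E}_\mu f(A)x, y),$$

whence it follows that $f(A)\mathcal{E}_\mu = \mathcal{E}_\mu f(A)$. Notice also that, by (180), it follows from the above formulae that

$$(f(A)\mathcal{E}_\mu x, y) = \int_{m-\varepsilon_0}^{\mu} f(\lambda)\, d(\mathcal{E}_\lambda x, y). \tag{304}$$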


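Formula (302) is obtained from the following chain of equations:

$$(f_1(A)x, f_2(A)y) = \int_{m-\varepsilon_0}^{M} f_1(\lambda)\, d(\mathcal{E}_\lambda x, f_2(A)y) = \int_{m-\varepsilon_0}^{M} f_1(\lambda)\, d\,\overline{(f_2(A)\mathcal{E}_\lambda y, x)} =$$

$$= \int_{m-\varepsilon_0}^{M} f_1(\lambda)\, d_\lambda\, \overline{\int_{m-\varepsilon_0}^{\lambda} f_2(\mu)\, d(\mathcal{E}_\mu y, x)} = \int_{m-\varepsilon_0}^{M} f_1(\lambda)\,\overline{f_2(\lambda)}\, d(\mathcal{E}_\lambda x, y). \tag{305}$$

Finally,

$$(f_1(A) f_2(A)x, y) = \int_{m-\varepsilon_0}^{M} f_1(\lambda)\, d(\mathcal{E}_\lambda f_2(A)x, y) = \int_{m-\varepsilon_0}^{M} f_1(\lambda)\, d_\lambda \int_{m-\varepsilon_0}^{\lambda} f_2(\mu)\, d(\mathcal{E}_\mu x, y) = \int_{m-\varepsilon_0}^{M} f_1(\lambda) f_2(\lambda)\, d(\mathcal{E}_\lambda x, y),$$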
and similarly for the product $f_2(A) f_1(A)$. It can be shown, precisely as in [143], that $f(A)$ commutes with any operator $B$ that commutes with $A$. The converse is also true, i.e. if a bounded linear operator $C$ commutes with any operator $B$ that commutes with $A$, there exists a function $f(\lambda)$ such that $C = f(A)$. The proof of this important proposition can be found in F. Riesz's article Functions of Hermitian operators in Hilbert space (O funktsiyakh ermitovykh operatorov v gil'bertovom prostranstve) (Uspekhi matematicheskikh nauk, IX).

If we put $f_2(\lambda) = f_1(\lambda)$ and $y = x$ in (305), we obtain

$$\|f_1(A)x\|^2 = \int_{m-\varepsilon_0}^{M} |f_1(\lambda)|^2\, d\|\mathcal{E}_\lambda x\|^2. \tag{306}$$

Some further simple facts may be mentioned, in regard to functions of a self-conjugate operator. If $|f(\lambda)| = 1$, then $f(A)$ is a unitary operator. If $f(\lambda)$ takes only the values 0 and 1, $f(A)$ is a projector. Let us show that, if $\lambda = \lambda_0$ is an eigenvalue of $A$ and $x_0$ a corresponding eigenelement, $f(\lambda_0)$ is an eigenvalue of $f(A)$ with the same eigenelement $x_0$. We know that $\mathcal{E}_\lambda x_0 = 0$ for $\lambda < \lambda_0$ and $\mathcal{E}_\lambda x_0 = x_0$ for $\lambda \ge \lambda_0$, and (296) gives us for any $y$: $(f(A)x_0, y) = f(\lambda_0)(x_0, y) = (f(\lambda_0)x_0, y)$, whence, since $y$ is arbitrary, we have $f(A)x_0 = f(\lambda_0)x_0$. Notice that, if $f(\lambda)$ has a finite number of discontinuities, it is measurable with respect to all the functions (297), so that $f(A)$ has a definite meaning. The same will be true when $f(\lambda)$ is a B function [47]; this fact has already been used above.
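The properties above can be checked by hand in the simplest possible setting. The following is only an illustrative sketch, assuming a real symmetric 2x2 matrix with distinct eigenvalues, for which integral (296) reduces to the finite sum $f(A) = \sum_k f(\lambda_k) P_k$ over the eigenprojectors $P_k$. The helper names (`spectral_decomposition`, `function_of_operator`) and the sample matrix are hypothetical and are not taken from the text.

```python
import math

def spectral_decomposition(a, b, d):
    """Eigenvalues and orthogonal eigenprojectors of [[a, b], [b, d]], assuming b != 0."""
    mean = (a + d) / 2.0
    half = math.hypot((a - d) / 2.0, b)
    pieces = []
    for lam in (mean - half, mean + half):
        vx, vy = b, lam - a              # eigenvector for lam (valid because b != 0)
        n2 = vx * vx + vy * vy
        proj = [[vx * vx / n2, vx * vy / n2],
                [vx * vy / n2, vy * vy / n2]]
        pieces.append((lam, proj))
    return pieces

def function_of_operator(f, pieces):
    """f(A) = sum_k f(lambda_k) P_k: the finite-dimensional form of integral (296)."""
    return [[sum(f(lam) * proj[i][j] for lam, proj in pieces) for j in range(2)]
            for i in range(2)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(X, Y, tol=1e-12):
    return all(abs(X[i][j] - Y[i][j]) <= tol for i in range(2) for j in range(2))

# Sample operator A = [[2, 1], [1, 2]] (eigenvalues 1 and 3).
pieces = spectral_decomposition(2.0, 1.0, 2.0)
f1 = lambda t: t * t
f2 = lambda t: 1.0 / (1.0 + t)

A = function_of_operator(lambda t: t, pieces)          # f(t) = t recovers A
product = matmul(function_of_operator(f1, pieces),
                 function_of_operator(f2, pieces))      # f1(A) f2(A)
combined = function_of_operator(lambda t: f1(t) * f2(t), pieces)  # (f1 f2)(A)
# Step function f_mu of (300) with mu = 2: gives the spectral projector E_mu of (301).
step = function_of_operator(lambda t: 1.0 if t <= 2.0 else 0.0, pieces)
```

Here `product` and `combined` agree, in line with (303), and `step` is idempotent, as a projector must be.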


156. Commuting operators. We must now consider the problem of commuting self-conjugate operators.

THEOREM 1. The necessary and sufficient condition for two self-conjugate operators $A$ and $B$ to commute is that their spectral functions $\mathcal{E}_\lambda$ and $\mathcal{F}_\mu$ commute for any $\lambda$ and $\mu$.

We know that the spectral function of any self-conjugate operator $C$ commutes with $C$, and with any operator that commutes with $C$ [143]. It follows from this that, if $AB = BA$, $\mathcal{F}_\mu$ commutes with $A$, so that $\mathcal{E}_\lambda$ commutes with $\mathcal{F}_\mu$. Conversely, if $\mathcal{E}_\lambda$ commutes with $\mathcal{F}_\mu$, the Riemann-Stieltjes sums in the integral forms of operators $A$ and $B$ commute, so that the operators themselves commute.

THEOREM 2. If the self-conjugate operators $A$, $B$ and $C$, having purely point spectra, commute in pairs, there exists a closed orthonormal system of elements which are eigenelements of these operators.

Let $\mathcal{E}_\lambda$, $\mathcal{F}_\mu$ and $\mathcal{G}_\nu$ be the spectral functions of the operators. By Theorem 1, they commute in pairs. Let $\lambda$, $\mu$ and $\nu$ be any given eigenvalues of $A$, $B$ and $C$, and $L_\lambda$, $M_\mu$ and $N_\nu$ be the subspaces of the corresponding eigenelements, whilst $\Delta_\lambda = \mathcal{E}_\lambda - \mathcal{E}_{\lambda-0}$; $\Delta'_\mu = \mathcal{F}_\mu - \mathcal{F}_{\mu-0}$; $\Delta''_\nu = \mathcal{G}_\nu - \mathcal{G}_{\nu-0}$ are the projectors onto these subspaces. These projectors mutually commute, so that their product

$$\Delta_{\lambda\mu\nu} = \Delta_\lambda \Delta'_\mu \Delta''_\nu$$

is the projector onto subspace $R_{\lambda\mu\nu}$, which consists of elements common to $L_\lambda$, $M_\mu$ and $N_\nu$ [140]. If we take two distinct subspaces $R_{\lambda\mu\nu}$ and $R_{\lambda'\mu'\nu'}$, at least one of the number pairs $(\lambda, \lambda')$, $(\mu, \mu')$ and $(\nu, \nu')$ consists of different numbers. Suppose say $\lambda \ne \lambda'$. Now, if $x \in R_{\lambda\mu\nu}$ and $x' \in R_{\lambda'\mu'\nu'}$, then $x$ and $x'$ are eigenelements of $A$ corresponding to different eigenvalues, i.e. they are mutually orthogonal. The subspaces $R_{\lambda\mu\nu}$ are therefore mutually orthogonal. We must show that their orthogonal sum is the whole of $H$. All we need do is show that no non-zero element exists, which is orthogonal to all the subspaces $R_{\lambda\mu\nu}$, i.e. we need only show that, if an element $x_0$ is non-zero, it is not orthogonal to at least one of the $R_{\lambda\mu\nu}$. This is obvious for the subspaces $L_\lambda$, $M_\mu$ and $N_\nu$, since $A$, $B$ and $C$ have purely point spectra by hypothesis, so that the orthogonal sum, say of $L_\lambda$, is the whole of $H$. Let us take an element $x_0 \ne 0$. By what has just been proved, an eigenvalue $\lambda$ of the operator $A$ exists such that $\Delta_\lambda x_0 \ne 0$. Further, by the same arguments, an eigenvalue $\mu$ of the operator $B$ exists such that $\Delta'_\mu(\Delta_\lambda x_0) \ne 0$, and an eigenvalue $\nu$ of $C$ exists such that $\Delta''_\nu(\Delta'_\mu \Delta_\lambda x_0) \ne 0$, whence it follows that $x_0$ is not orthogonal to $R_{\lambda\mu\nu}$. Therefore, the orthogonal sum of $R_{\lambda\mu\nu}$ is the whole of $H$. If we take a closed orthonormal system in each $R_{\lambda\mu\nu}$, we get an orthonormal system, closed in $H$, and each element of it, belonging to some $R_{\lambda\mu\nu}$, is an eigenelement of each of the operators $A$, $B$ and $C$. The proof of the theorem is exactly the same for any finite number of mutually commuting self-conjugate operators.

We saw above that different functions of the same self-conjugate operator are commuting operators [155]. Let us now prove the converse, for the case when the operators have a purely point spectrum.

THEOREM 3. If the self-conjugate operators $A$, $B$ and $C$, having purely point spectra, commute in pairs, they are functions of the same self-conjugate operator $D$.

Space $H$ is assumed separable. By Theorem 2, there exists a closed orthonormal system of elements

$$x_1, x_2, x_3, \dots$$

which are eigenelements of $A$, $B$ and $C$, i.e.

$$Ax_k = \lambda_k x_k; \quad Bx_k = \mu_k x_k; \quad Cx_k = \nu_k x_k.$$

Let $\rho_k = 1/k$, and let $x$ be any element. It can be expanded in elements $x_k$:

$$x = \sum_{k=1}^{\infty} a_k x_k,$$

and we define the self-conjugate operator $D$ by putting

$$Dx = \sum_{k=1}^{\infty} a_k \rho_k x_k.$$

The series on the right-hand side is clearly convergent, since the numbers $|a_k|^2$ form a convergent series, i.e. $|a_k \rho_k|^2$ all the more form a convergent series. It follows at once from this definition that the $x_k$ are eigenelements of $D$, corresponding to the eigenvalues $\rho_k$, i.e. $D$ has a purely point spectrum. We can form the bounded function $f_1(\lambda)$, equal to $\lambda_k$ at the points $\lambda = \rho_k$ and continuous everywhere, with the possible exception of the point $\lambda = 0$. Similarly, we can form $f_2(\lambda)$ with similar properties, so that $f_2(\rho_k) = \mu_k$, and $f_3(\lambda)$ so that $f_3(\rho_k) = \nu_k$. By the results of the previous section, corresponding operators $f_1(D)$, $f_2(D)$ and $f_3(D)$ can be formed for the functions $f_s(\lambda)$. The operator $f_1(D)$ has eigenelements $x_k$ and eigenvalues $f_1(\rho_k) = \lambda_k$, where the $x_k$ form a closed system. The operator $A$ has the same eigenvalues and eigenelements. But if two operators with purely point spectra have the same eigenvalues and corresponding eigenelements, their integral forms in terms of the spectral function imply that they coincide, i.e. $A = f_1(D)$. Similarly, $B = f_2(D)$ and $C = f_3(D)$; the theorem is proved. Notice that the proof will be exactly the same for any finite number of self-conjugate operators. The theorem can also be proved in the case when the operators do not have purely point spectra, but we shall not dwell on this (J. Neumann, Annals of Math., t. 32, 1931).
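The construction in the proof of Theorem 3 can be imitated in a finite-dimensional setting. The sketch below is only an illustration under assumptions that are not in the text: the space has four dimensions, the three operators are given directly by their (hypothetical) eigenvalues in a shared eigenbasis, and the functions $f_s$ are chosen to be piecewise linear. Because the labels $\rho_k = 1/k$ are distinct, one function per operator is enough to recover its eigenvalues from those of $D$.

```python
from fractions import Fraction

def interpolant(nodes, values):
    """Continuous piecewise-linear function f with f(nodes[k]) == values[k]."""
    pts = sorted(zip(nodes, values))      # nodes are distinct, so sorting is by node
    def f(t):
        if t <= pts[0][0]:
            return pts[0][1]
        for (t0, v0), (t1, v1) in zip(pts, pts[1:]):
            if t <= t1:
                return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
        return pts[-1][1]
    return f

# Hypothetical eigenvalues of three commuting operators in a shared eigenbasis x_1..x_4.
lam = [2, 2, -1, 5]      # A x_k = lam[k] x_k
mu = [0, 1, 1, 3]        # B x_k = mu[k] x_k
nu = [7, 7, 7, -4]       # C x_k = nu[k] x_k

# D x_k = rho[k] x_k with pairwise distinct eigenvalues rho_k = 1/k.
rho = [Fraction(1, k) for k in range(1, 5)]

f1, f2, f3 = (interpolant(rho, vals) for vals in (lam, mu, nu))

# Eigenvalues of f_s(D) on the basis x_k are f_s(rho_k).
A_from_D = [f1(r) for r in rho]
B_from_D = [f2(r) for r in rho]
C_from_D = [f3(r) for r in rho]
```

Repeated eigenvalues of $A$, $B$ or $C$ cause no difficulty, since only the eigenvalues of $D$ need to be distinct.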


157. Perturbations of the spectrum of a self-conjugate operator. Remember that the points of condensation of the spectrum of a self-conjugate operator are defined as the values of $\lambda$ which are either limit points of a point spectrum, or eigenvalues of infinite rank, or points of a continuous spectrum. We shall prove below the following theorem.

THEOREM 1. If a completely continuous self-conjugate operator $C$ is added to a self-conjugate operator $A$, the set of points of condensation of the spectrum remains unchanged.

Nevertheless, the addition of a completely continuous operator can substantially change the nature of the spectrum. In fact, the following theorem holds:

THEOREM 2. We can add to any given self-conjugate operator $A$ a self-conjugate completely continuous operator $C$, with absolute norm not exceeding any given positive number $\varepsilon$, such that $A + C$ has a purely point spectrum.

The following can be proved by making use of this last theorem:

THEOREM 3. If the self-conjugate operators $A_1$ and $A_2$ have the same set of points of condensation of the spectrum, there exist a unitary operator $U$ and a self-conjugate completely continuous operator $C$ such that $A_2 = UA_1U^{-1} + C$.

We shall only prove Theorem 1. Two preliminary lemmas are required.

LEMMA 1. If $\lambda = \mu$ is a point of condensation of the spectrum of a self-conjugate operator $A$, there exists a sequence of normalized elements $x_n$, weakly convergent to zero, such that

$$\|Ax_n - \mu x_n\| \to 0. \tag{307}$$

If $\mu$ is a limit point of a point spectrum or an eigenvalue of infinite rank, there exists an infinite sequence of mutually orthogonal normalized eigenelements $x_n$ such that the corresponding eigenvalues $\lambda_n$ tend to $\mu$. If $z$ is any element, its Fourier coefficients $c_n = (z, x_n)$ tend to zero, and consequently $x_n \rightharpoonup 0$ (weak convergence), and the lemma follows for this case from the expressions

$$\|Ax_n - \mu x_n\| = \|(A - \lambda_n E)x_n + (\lambda_n - \mu)x_n\| = |\lambda_n - \mu| \cdot \|x_n\| = |\lambda_n - \mu|.$$

Now let $\mu$ be a point of a continuous spectrum, and $\mathcal{E}_\lambda$ the continuous part of the spectral function. The difference $\mathcal{E}_{\mu+\delta} - \mathcal{E}_{\mu-\delta}$, given any small positive $\delta$, is a projector into some subspace $L_\delta$. We take a sequence of positive numbers $\delta_n$ such that $\delta_n \to 0$, and a sequence of normalized elements $x_n$ of $L_{\delta_n}$. We shall show that the lemma is also true in this case. By the definition of $L_{\delta_n}$, we have $(\mathcal{E}_{\mu+\delta_n} - \mathcal{E}_{\mu-\delta_n})x_n = x_n$, and for any element $z$:

$$(z, x_n) = (z, (\mathcal{E}_{\mu+\delta_n} - \mathcal{E}_{\mu-\delta_n})x_n) = ((\mathcal{E}_{\mu+\delta_n} - \mathcal{E}_{\mu-\delta_n})z, x_n),$$

$$|(z, x_n)| \le \|(\mathcal{E}_{\mu+\delta_n} - \mathcal{E}_{\mu-\delta_n})z\|.$$

But we have $\mathcal{E}_{\mu+\delta_n} - \mathcal{E}_{\mu-\delta_n} \to 0$, and the weak convergence $x_n \rightharpoonup 0$ is proved.

To prove (307), we have to use the following obvious formula:

$$\|Ax_n - \mu x_n\|^2 = \int_{m-\varepsilon_0}^{M} (\lambda - \mu)^2\, d(\mathcal{E}_\lambda x_n, x_n) = \int_{m-\varepsilon_0}^{M} (\lambda - \mu)^2\, d(\mathcal{E}_\lambda(\mathcal{E}_{\mu+\delta_n} - \mathcal{E}_{\mu-\delta_n})x_n, x_n) =$$

$$= \int_{\mu-\delta_n}^{\mu+\delta_n} (\lambda - \mu)^2\, d\|(\mathcal{E}_\lambda - \mathcal{E}_{\mu-\delta_n})x_n\|^2 \le \delta_n^2\, \|(\mathcal{E}_{\mu+\delta_n} - \mathcal{E}_{\mu-\delta_n})x_n\|^2 = \delta_n^2 \to 0.$$


LEMMA 2. If $\lambda = \mu$ is not a point of condensation of the spectrum, there exists for any sequence of normalized elements $x_n$, weakly convergent to zero, a positive number $a$ such that, for all sufficiently large $n$,

$$\|Ax_n - \mu x_n\| \ge a > 0. \tag{308}$$

By hypothesis, there exists a positive number $d$ such that the spectral function $\mathcal{E}_\lambda$ is either constant in the interval $\mu - d \le \lambda \le \mu + d$, or its variation amounts to a jump at the point $\lambda = \mu$, the subspace $L_\mu$ corresponding to this jump being finite-dimensional. We have

$$\|Ax_n - \mu x_n\|^2 = \int_{m-\varepsilon_0}^{M} (\lambda - \mu)^2\, d\|\mathcal{E}_\lambda x_n\|^2 \ge \int_{m-\varepsilon_0}^{\mu-d} (\lambda - \mu)^2\, d\|\mathcal{E}_\lambda x_n\|^2 + \int_{\mu+d}^{M} (\lambda - \mu)^2\, d\|\mathcal{E}_\lambda x_n\|^2 \ge$$

$$\ge d^2\left[\|\mathcal{E}_{\mu-d}x_n\|^2 + \left(\|x_n\|^2 - \|\mathcal{E}_{\mu+d}x_n\|^2\right)\right] = d^2 - d^2\left[\|\mathcal{E}_{\mu+d}x_n\|^2 - \|\mathcal{E}_{\mu-d}x_n\|^2\right],$$

or

$$\|Ax_n - \mu x_n\|^2 \ge d^2 - d^2\left((\mathcal{E}_{\mu+d} - \mathcal{E}_{\mu-d})x_n, x_n\right). \tag{309}$$

If $\mathcal{E}_{\mu+d} - \mathcal{E}_{\mu-d} = 0$, we get (308) by putting $a = d$. Now let $\mathcal{E}_\lambda$ have a jump at $\lambda = \mu$, and let $z_1, z_2, \dots, z_m$ be a complete orthonormal system in $L_\mu$. Now,

$$(\mathcal{E}_{\mu+d} - \mathcal{E}_{\mu-d})x_n = \sum_{s=1}^{m} (x_n, z_s)\, z_s;$$

we have $(\mathcal{E}_{\mu+d} - \mathcal{E}_{\mu-d})x_n \Rightarrow 0$ (strong convergence), since $x_n \rightharpoonup 0$, and it follows from (309) that (308) is satisfied for sufficiently large $n$, if we put say $a = d/2$. It follows at once from these lemmas that the necessary and sufficient condition for $\lambda = \mu$ to be a point of condensation of the spectrum is that a sequence of normalized elements $x_n$ exist such that $x_n \rightharpoonup 0$ and (307) holds.

The proof of Theorem 1 is now quite easy. We add a completely continuous self-conjugate operator $C$ to the operator $A$, and let $A_1 = A + C$. Now, $A = A_1 + (-C)$, where $(-C)$ is also a completely continuous self-conjugate operator. Let $\lambda = \mu$ be a point of condensation of the spectrum of $A$, and $x_n$ a sequence satisfying condition (307); we now have $Cx_n \Rightarrow 0$, since $x_n \rightharpoonup 0$, and it follows from $\|A_1x_n - \mu x_n\| \le \|Ax_n - \mu x_n\| + \|Cx_n\|$ that $\|A_1x_n - \mu x_n\| \to 0$, i.e. $\lambda = \mu$ is a point of condensation of the spectrum of $A_1$. Similarly, it follows conversely from $A = A_1 + (-C)$ that every point of condensation of the spectrum of $A_1$ is also a point of condensation of the spectrum of $A$; the theorem is therefore proved.


158. Normal operators. Another particular type of linear operator must be mentioned. A linear operator $A$ is said to be normal if it commutes with its conjugate [cf. IV, 41], i.e.

$$AA^* = A^*A. \tag{310}$$

Self-conjugate and unitary operators represent a particular case of normal operators. If we put

$$A_1 = \frac{1}{2}(A + A^*), \qquad A_2 = \frac{1}{2i}(A - A^*), \tag{311}$$

we can express $A$ and $A^*$ in terms of self-conjugate operators $A_1$ and $A_2$:

$$A = A_1 + iA_2; \qquad A^* = A_1 - iA_2. \tag{312}$$

These formulae have the immediate consequence that the necessary and sufficient condition for an operator to be normal is that the self-conjugate operators $A_1$ and $A_2$ commute. If this is the case, the spectral functions $\mathcal{E}^{(1)}_\lambda$ and $\mathcal{E}^{(2)}_\mu$ of these operators commute for any $\lambda$ and $\mu$. We define a family of projectors $\mathcal{E}_\alpha$, depending on the complex variable $\alpha = \lambda + \mu i$, by putting

$$\mathcal{E}_\alpha = \mathcal{E}^{(1)}_\lambda \mathcal{E}^{(2)}_\mu \qquad (\alpha = \lambda + \mu i). \tag{313}$$

This projector $\mathcal{E}_\alpha$ will only be variable in some interval $\Delta_0$ of the plane of the complex variable $\alpha$, and we shall have formulae for the operator $A$, precisely analogous to the formulae for a self-conjugate operator:

$$A = \iint_{\Delta_0} \alpha\, d\mathcal{E}_\alpha; \qquad (Ax, y) = \iint_{\Delta_0} \alpha\, d(\mathcal{E}_\alpha x, y). \tag{314}$$
Let us prove say the second of these formulae. Let the interval $\Delta_0$ be defined by the inequalities $a \le \lambda \le b$; $c \le \mu \le d$. We have

$$\iint_{\Delta_0} \alpha\, d(\mathcal{E}_\alpha x, y) = \iint_{\Delta_0} \lambda\, d_\lambda d_\mu (\mathcal{E}^{(1)}_\lambda \mathcal{E}^{(2)}_\mu x, y) + i \iint_{\Delta_0} \mu\, d_\lambda d_\mu (\mathcal{E}^{(1)}_\lambda \mathcal{E}^{(2)}_\mu x, y).$$

After forming the Riemann-Stieltjes sums, we can sum over $\mu$ in the first of these integrals, since the integrand is independent of $\mu$; it must be recalled here that $\mathcal{E}^{(2)}_c = 0$ and $\mathcal{E}^{(2)}_d = E$. Similarly, we can sum over $\lambda$ in the second integral, where $\mathcal{E}^{(1)}_a = 0$ and $\mathcal{E}^{(1)}_b = E$. We therefore obtain

$$\iint_{\Delta_0} \alpha\, d(\mathcal{E}_\alpha x, y) = \int_a^b \lambda\, d(\mathcal{E}^{(1)}_\lambda x, y) + i \int_c^d \mu\, d(\mathcal{E}^{(2)}_\mu x, y) = (A_1x, y) + i(A_2x, y).$$

On the other hand,

$$(Ax, y) = (A_1x, y) + i(A_2x, y),$$

and a comparison gives us the second of equations (314). A general theory of normal operators can be further developed in analogy with the theory of self-conjugate operators.
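The decomposition (311)-(312) and the criterion that $A$ is normal exactly when $A_1$ and $A_2$ commute can be tried out on small complex matrices. The following sketch is illustrative only; the two sample matrices and the helper names are assumptions introduced here, not part of the text.

```python
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def conj_transpose(X):
    """Conjugate operator A* of a square matrix."""
    n = len(X)
    return [[complex(X[j][i]).conjugate() for j in range(n)] for i in range(n)]

def cartesian_parts(X):
    """Return (A1, A2) of (311): A1 = (A + A*)/2, A2 = (A - A*)/(2i)."""
    Xs = conj_transpose(X)
    n = len(X)
    A1 = [[(X[i][j] + Xs[i][j]) / 2 for j in range(n)] for i in range(n)]
    A2 = [[(X[i][j] - Xs[i][j]) / 2j for j in range(n)] for i in range(n)]
    return A1, A2

def commute(X, Y, tol=1e-12):
    XY, YX = matmul(X, Y), matmul(Y, X)
    n = len(X)
    return all(abs(XY[i][j] - YX[i][j]) <= tol for i in range(n) for j in range(n))

def is_normal(X):
    """Condition (310): A A* = A* A."""
    return commute(X, conj_transpose(X))

normal_example = [[1, 1j], [1j, 1]]   # satisfies (310)
shear = [[0, 1], [0, 0]]              # does not satisfy (310)

for M in (normal_example, shear):
    A1, A2 = cartesian_parts(M)
    print(is_normal(M), commute(A1, A2))
```

For each matrix the two printed truth values coincide, which is the content of the criterion stated after (312).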

Suppose that, given the normal operator $A$, the self-conjugate operators $A_1$ and $A_2$ have a purely point spectrum. We can take a closed orthonormal system of elements $x_k$ ($k = 1, 2, 3, \dots$) which are eigenelements of $A_1$ and $A_2$ [156], i.e.

$$A_1x_k = \mu_k^{(1)} x_k; \qquad A_2x_k = \mu_k^{(2)} x_k.$$

Now, obviously,

$$Ax_k = \left(\mu_k^{(1)} + i\mu_k^{(2)}\right) x_k,$$

and the $x_k$ are therefore the eigenelements of $A$ corresponding to the eigenvalues $\mu_k^{(1)} + i\mu_k^{(2)}$.

Let us also take the case when the normal operator $A$ is completely continuous. We know that the operator $A^*$ is now also completely continuous, and by (311), the operators $A_1$ and $A_2$ are completely continuous. The theorem of [136] may readily be extended to the case of normal completely continuous operators. Let $\mu_k$ be non-zero eigenvalues of $A_1$, and $x_k$ the corresponding eigenelements, i.e.

$$A_1x_k = \mu_k x_k.$$

Recalling that $A_1$ commutes with $A_2$, we obtain on applying $A_2$ to both sides:

$$A_1(A_2x_k) = \mu_k A_2x_k,$$

i.e. $A_2x_k$ is either the zero element, or the eigenelement of $A_1$ corresponding to the same eigenvalue. Suppose that $\mu_k$ is an eigenvalue of rank $h$ and that $\mu_k = \mu_{k+1} = \dots = \mu_{k+h-1}$. Now, in view of what has been said, we must have

$$A_2x_j = \sum_{s=k}^{k+h-1} c_{js} x_s \qquad (j = k, k+1, \dots, k+h-1)$$

and

$$c_{js} = (A_2x_j, x_s) = (x_j, A_2x_s) = \overline{c_{sj}},$$

i.e. the $c_{js}$ form a finite Hermitian matrix. We can reduce this matrix to the diagonal form by means of a unitary transformation of the $x_j$ (which makes no essential difference), and hence we can write, on retaining the previous notation for the elements:

$$A_1x_j = \mu_j x_j; \qquad A_2x_j = \nu_j x_j \qquad (j = k, k+1, \dots, k+h-1),$$

where some of the numbers $\nu_j$, or even all of them, may be zero. We can carry out this operation for all the non-zero eigenvalues of $A_1$. After this, all the non-zero eigenvalues of $A_2$ may not be obtained. If we take the eigenvalues that have not been obtained and carry out an operation similar to the above, proceeding from $A_2$ and passing to $A_1$, we finally get a finite or denumerable set of elements $y_k$ ($k = 1, 2, \dots$), orthogonal in pairs, normalized, and such that

$$A_1y_k = \mu_k^{(1)} y_k; \qquad A_2y_k = \mu_k^{(2)} y_k,$$

where at least one of the two real numbers $\mu_k^{(1)}$ and $\mu_k^{(2)}$ is non-zero, whilst every eigenelement of $A_1$ that corresponds to a non-zero eigenvalue is linearly expressible in terms of a finite number of $y_k$, and similarly for $A_2$. We also obviously have

$$Ay_k = \left(\mu_k^{(1)} + i\mu_k^{(2)}\right) y_k; \qquad A^*y_k = \left(\mu_k^{(1)} - i\mu_k^{(2)}\right) y_k.$$


Suppose that

$$w = Ay = A_1y + iA_2y.$$

We have [136]:

$$A_1y = \sum_k a_k y_k \quad\text{and}\quad A_2y = \sum_k b_k y_k,$$

so that any element $w$ expressible in the form $Ay$ can be expanded in elements $y_k$:

$$w = Ay = \sum_k (a_k + b_k i)\, y_k.$$

Notice that, if say $\mu_k^{(1)} = 0$, the term containing $y_k$ will be absent in the expansion of $A_1y$. We saw above [155] that, if an operator $A$ is a function of a self-conjugate operator $B$, $A^*$ is a function of $B$, so that $A$ and $A^*$ commute, i.e. $A$ is a normal operator. Thus any function of a self-conjugate operator is a normal operator. The converse is also true: every normal operator is a function of a self-conjugate operator. For, let $A = A_1 + iA_2$ be a normal operator. The self-conjugate operators $A_1$ and $A_2$ commute, so that, as remarked in [156], they are functions of the same self-conjugate operator $B$: $A_1 = F_1(B)$ and $A_2 = F_2(B)$. On forming the function $F(\lambda) = F_1(\lambda) + iF_2(\lambda)$, we get $A = F(B)$, which is what we wanted to prove.


159. Auxiliary propositions. The present and subsequent sections are devoted to proving the fundamental theorem of [142] and the fact that, if an operator commutes with a self-conjugate operator $A$, it also commutes with its spectral function $\mathcal{E}_\lambda$ for any $\lambda$. We can make use of the results which were obtained prior to [142], when developing the proof. Certain supplementary lemmas first need to be proved.

LEMMA 1. If $A$ and $B$ are commuting self-conjugate operators, satisfying the relationship

$$A^2 = B^2, \tag{315}$$

and $P$ is the projection operator into subspace $L$, formed by the elements $x$ that satisfy

$$(A + B)x = 0, \quad\text{i.e.}\quad Ax = -Bx, \tag{316}$$

the following properties may be proved:
(1) if an operator $D$ commutes with $(A + B)$, it also commutes with $P$;
(2) if $Ax = 0$, then $x \in L$, i.e. $Px = x$;
(3) the operator $A$ can be expressed by the formula

$$A = (E - 2P)B. \tag{317}$$
1. We have by hypothesis:

$$D(A + B) = (A + B)D. \tag{318}$$

If $x \in L$, by (316), $D(A + B)x = 0$, so that $(A + B)Dx = 0$, i.e. $Dx \in L$ also. Let $x$ be any element of $H$; then $Px \in L$ and, by what has just been proved, $DPx \in L$, so that we can write for any element $x$ of $H$: $PDPx = DPx$, i.e.

$$PDP = DP. \tag{319}$$

On passing to conjugate operators in (318) and recalling that $A$ and $B$ are self-conjugate, we get [124]:

$$(A + B)D^* = D^*(A + B),$$

i.e. $D^*$ also commutes with $A + B$, and we can write (319) for it, i.e.

$$PD^*P = D^*P.$$

On passing to conjugate operators in this equation, and noting that $P$ is self-conjugate, we get $PDP = PD$. Comparison of this equation with (319) gives $DP = PD$, i.e. $D$ in fact commutes with $P$, as we wished to prove. In particular, $A$ and $B$ commute with $(A + B)$ by hypothesis, so that $A$ and $B$ commute with $P$.

2. It follows from the equations

$$\|Ax\|^2 = (Ax, Ax) = (A^2x, x); \qquad \|Bx\|^2 = (Bx, Bx) = (B^2x, x)$$

and condition (315) that $\|Ax\| = \|Bx\|$ for any element $x$. If $Ax = 0$, $Bx = 0$ also, so that $x$ satisfies (316), i.e. $x \in L$ and $Px = x$, which is what we had to prove.

3. Using (315) and the fact that $A$ and $B$ commute, we have $(A + B)(A - B) = 0$, i.e. if $x$ is an element of $H$, then $(A - B)x \in L$, so that $P(A - B)x = (A - B)x$, i.e.

$$P(A - B) = A - B.$$

Given any element $x$, the element $Px \in L$, so that $(A + B)Px = 0$, i.e.

$$(A + B)P = 0.$$

On subtracting from this the previous equation, and noting that $A$ and $B$ commute with $P$, we get $2PB = -A + B$, whence (317) follows; the lemma is proved.

LEMMA 2. If the self-conjugate operator $C \ge 0$, and the self-conjugate operator $F$ commutes with $C$, then $F^2C = CF^2 \ge 0$.

Using the notation $Fx = y$, we can write by hypothesis:

$$(CF^2x, x) = (FCFx, x) = (CFx, Fx) = (Cy, y) \ge 0,$$

which proves the lemma. A particular case is worth noticing. If the projector $P$ commutes with $C$, we can say that $PC \ge 0$, since $P^2 = P$.

If $P(t) = a_0 + a_1t + \dots + a_nt^n$ is a polynomial and $A$ is an operator, the polynomial can be associated, as we have seen, with the operator $P(A) = a_0E + a_1A + \dots + a_nA^n$. If $A$ is a self-conjugate operator and the coefficients $a_k$ are real, $P(A)$ is also self-conjugate. We shall need two further lemmas before investigating the properties of operator polynomials.

LEMMA 3. If the polynomial $P(t)$ is positive in the interval $[0, 1]$, for all sufficiently large values of $p$ it can be written in the form

$$P(t) = \sum_{s=0}^{p} c_s t^s (1 - t)^{p-s}, \tag{320}$$

where all the coefficients $c_s$ are positive.

This follows for first degree polynomials from

$$P(t) = c_0(1 - t) + c_1t, \quad\text{where}\ c_0 = P(0)\ \text{and}\ c_1 = P(1).$$

Let us take a positive second degree polynomial, that does not split up into real first degree factors:

$$P(t) = \alpha + 2\beta t + \gamma t^2 \qquad (\alpha > 0,\ \gamma > 0,\ \alpha\gamma - \beta^2 > 0).$$

Using the formula

$$[(1 - t) + t]^q = \sum_{s=0}^{q} C_q^s\, t^s (1 - t)^{q-s},$$

we can write the polynomial as

$$P(t) = \alpha \sum_{s=0}^{p} C_p^s\, t^s (1 - t)^{p-s} + 2\beta \sum_{s=1}^{p} C_{p-1}^{s-1}\, t^s (1 - t)^{p-s} + \gamma \sum_{s=2}^{p} C_{p-2}^{s-2}\, t^s (1 - t)^{p-s},$$

or, on collecting like terms,

$$P(t) = \frac{1}{p(p-1)} \sum_{s=0}^{p} C_p^s\, t^s (1 - t)^{p-s} \left[ p(p-1)\alpha + 2\beta(p-1)s + \gamma s(s-1) \right]. \tag{321}$$


The expression in square brackets is positive for all real $s$ and for all sufficiently large $p$. For, the discriminant of this quadratic form in $s$:

$$p(p-1)\alpha\gamma - \frac{1}{4}(2p\beta - 2\beta - \gamma)^2 = p^2(\alpha\gamma - \beta^2) + p(2\beta^2 + \beta\gamma - \alpha\gamma) - \frac{1}{4}(2\beta + \gamma)^2,$$

is positive for all sufficiently large $p$ in the case $\alpha\gamma - \beta^2 > 0$. Hence (321) in fact leads to (320) with positive $c_s$ for all sufficiently large $p$. Let us now take any positive polynomial in the interval $[0, 1]$. It can be written as the product of positive polynomials of the first degree and positive second degree polynomials with imaginary roots. We have an expansion (320) for each factor. We thus get a similar expansion for their product, the degree $p$ being equal to the sum of the degrees of the individual factors.
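The coefficients $c_s$ of (320) can be computed exactly for a given polynomial and a given $p$, which makes it easy to watch them become positive as $p$ grows. The sketch below is a minimal illustration, not the author's procedure: it uses the identity $t^j = t^j[(1-t)+t]^{p-j}$ directly for every power rather than factorizing $P$, and the function names and the sample polynomial $1 - 3t + 3t^2$ are assumptions introduced here.

```python
from fractions import Fraction
from math import comb

def positive_form(coeffs, p):
    """Coefficients c_0..c_p with P(t) = sum_s c_s t^s (1 - t)^(p - s).

    coeffs[j] is the coefficient of t^j; p must be at least the degree of P.
    Uses t^j = t^j [(1 - t) + t]^(p - j) = sum_s C(p - j, s - j) t^s (1 - t)^(p - s).
    """
    if p < len(coeffs) - 1:
        raise ValueError("p must be at least the degree of P")
    return [sum(Fraction(a) * comb(p - j, s - j)
                for j, a in enumerate(coeffs) if j <= s)
            for s in range(p + 1)]

def evaluate_form(c, t):
    """Evaluate sum_s c_s t^s (1 - t)^(p - s) with p = len(c) - 1."""
    p = len(c) - 1
    return sum(cs * t**s * (1 - t)**(p - s) for s, cs in enumerate(c))

def smallest_positive_p(coeffs, p_max=200):
    """Smallest p <= p_max for which every c_s is strictly positive, else None."""
    for p in range(len(coeffs) - 1, p_max + 1):
        if all(cs > 0 for cs in positive_form(coeffs, p)):
            return p
    return None

# Sample polynomial P(t) = 1 - 3t + 3t^2: alpha = 1, beta = -3/2, gamma = 3,
# so alpha*gamma - beta^2 = 3/4 > 0 and P has no real roots.
sample = [1, -3, 3]
p_star = smallest_positive_p(sample)
coefficients = positive_form(sample, p_star)
```

For this sample the representation with $p$ equal to the degree has a negative coefficient, so a strictly larger $p$ is genuinely needed, as the lemma's phrase "for all sufficiently large values of $p$" suggests.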

Note. We can use the change of variable $t_1 = (t - a)/(b - a)$ to reduce any finite interval $a \le t \le b$ to the interval $0 \le t_1 \le 1$, and the following formula is obtained instead of (320) for polynomials positive in the interval $[a, b]$:

$$P(t) = \sum_{s=0}^{p} c_s (t - a)^s (b - t)^{p-s}. \tag{322}$$

LEMMA 4. If $m$ and $M$ are the bounds of a self-conjugate operator $A$, i.e. of the quadratic functional $(Ax, x)$ with $\|x\| = 1$, and $P(t)$ is a polynomial, non-negative in the interval $[m, M]$, $P(A)$ must be a positive operator, i.e.

$$(P(A)x, x) \ge 0. \tag{323}$$

It is sufficient to prove the lemma in the case when $P(t) > 0$ in the interval $[m, M]$. For, suppose that the lemma is proved in this case, and that $Q(t) \ge 0$ in $[m, M]$. On putting $P(t) = Q(t) + \varepsilon$, where $\varepsilon > 0$, we have $P(t) > 0$ in $[m, M]$, so that, by the above:

$$((Q(A) + \varepsilon E)x, x) = (Q(A)x, x) + \varepsilon(x, x) \ge 0.$$

On passing to the limit as $\varepsilon \to 0$, we get inequality (323) for $Q(A)$. Let us turn to the proof for positive $P(t)$. By (322), with $a = m$ and $b = M$, it is sufficient to show that the operator

$$(A - mE)^s (ME - A)^{p-s} \tag{324}$$

is positive, the number $p$ being taken as odd (the sum of positive operators is positive). Suppose say that $s = 2j$ is even, and let (324) be written as

$$A_1A_2^2,$$

where

$$A_1 = (ME - A); \qquad A_2 = (A - mE)^j (ME - A)^{\frac{p-2j-1}{2}}.$$

$A_1$ commutes with $A_2$, and $A_1$ is a positive operator, since $(A_1x, x) = M - (Ax, x) \ge 0$ for $\|x\| = 1$. By Lemma 2, we can say that operator (324) is positive. When $s$ is odd, we have to take $A_1 = (A - mE)$.

COROLLARY 1. If the polynomials $P_1(t)$ and $P_2(t)$ satisfy $P_1(t) \ge P_2(t)$ in the interval $[m, M]$, i.e. $P_1(t) - P_2(t) \ge 0$, then $P_1(A) \ge P_2(A)$. In particular, if $|P(t)| \le \varepsilon$, i.e. $-\varepsilon \le P(t) \le \varepsilon$, then $-\varepsilon E \le P(A) \le \varepsilon E$, i.e. $-\varepsilon \le (P(A)x, x) \le \varepsilon$ for $\|x\| = 1$, so that the norm of $P(A)$ is not greater than $\varepsilon$ [126].

COROLLARY 2. It follows from the previous corollary that, if a sequence of polynomials $P_n(t)$ tends uniformly in the interval $[m, M]$ to a polynomial $P(t)$, then $P_n(A) \to P(A)$, and the norm of the difference $P(A) - P_n(A)$ tends to zero.


160. Power series of operators. Let us recall the lemma proved in [131], the result of which amounts to the following: if the norms of a sequence of operators $A_n$ ($n = 1, 2, 3, \dots$) do not exceed the positive numbers $\delta_n$, which form a convergent series, the series

$$A = \sum_{n=1}^{\infty} A_n$$

is convergent, and the norm of $A$ does not exceed the sum of the numbers $\delta_n$. In particular, if we have the power series

$$\sum_{n=0}^{\infty} a_n t^n,$$

which is absolutely convergent in the interval $|t| \le k$, and the norm of the operator $A$ does not exceed $k$, then

$$\sum_{n=0}^{\infty} a_n A^n$$

is convergent.
We shall require below the following binomial formula:

$$\sqrt{1 + t} = \sum_{n=0}^{\infty} \binom{1/2}{n} t^n, \tag{325}$$

where

$$\binom{1/2}{0} = 1; \qquad \binom{1/2}{n} = \frac{\frac{1}{2}\left(\frac{1}{2} - 1\right)\cdots\left(\frac{1}{2} - n + 1\right)}{n!} \quad (n \ge 1). \tag{326}$$

Formula (325) gives the arithmetic value of the radical and remains valid with $t = \pm 1$ [I; 138]. The coefficients of expansion (326) are positive for odd and negative for even $n > 0$. Hence, on putting $t = -1$ in (325), all the terms except the first become negative, whence it follows that

$$\sum_{n=1}^{\infty} \left| \binom{1/2}{n} \right| = 1, \tag{327}$$

and series (325) is absolutely and uniformly convergent for $|t| \le 1$ [I; 146]. On replacing $t$ by $t^2 - 1$ in (325), we obtain on the left the absolute value of the square root of $t^2$, i.e. the absolute value $|t|$, and the following expansion is obtained for it into an absolutely convergent series in the interval $|t| \le 1$:

$$|t| = \sum_{n=0}^{\infty} \binom{1/2}{n} (t^2 - 1)^n. \tag{328}$$

This expansion may be applied to a self-conjugate operator. Let $A$ be self-conjugate, with norm $n_A$. We form the self-conjugate operator $C = (A^2/n_A^2) - E$. We have

$$(Cx, x) = \frac{1}{n_A^2}(A^2x, x) - \|x\|^2 = \frac{1}{n_A^2}\|Ax\|^2 - \|x\|^2,$$

whence it is clear that $-1 \le (Cx, x) \le 0$ for $\|x\| = 1$, whilst the norm of $C$ does not exceed unity. The series can be formed:

$$B = n_A \sum_{n=0}^{\infty} \binom{1/2}{n} C^n = n_A \sum_{n=0}^{\infty} \binom{1/2}{n} \left( \frac{A^2}{n_A^2} - E \right)^n. \tag{329}$$

If $S_n(t)$ is a segment of series (328), $S_n^2(t) \to t^2$ uniformly in the interval $[-1, +1]$, so that $S_n^2(A/n_A) \to A^2/n_A^2$, and in the limit the self-conjugate operator $B$, given by (329), satisfies $B^2 = A^2$.

Further, if the operator $D$ commutes with $A$, it commutes with the segment of series (329), and hence commutes with $B$ in the limit. It follows from this, in particular, that $A$ commutes with $B$, i.e. $AB = BA$.

Let us show further that $B$ is a positive operator. Recalling that the norm of $C$ is not greater than unity, we get $|(C^nx, x)| \le \|x\|^2$, and, after writing (329) in the form

$$B = n_A \left[ E + \sum_{n=1}^{\infty} \binom{1/2}{n} C^n \right],$$

we arrive at the inequality

$$(Bx, x) \ge n_A \left[ 1 - \sum_{n=1}^{\infty} \left| \binom{1/2}{n} \right| \right] \|x\|^2,$$

whence it follows, by (327), that $(Bx, x) \ge 0$.

Thus the following properties are finally obtained: $B$ is a self-conjugate positive operator, commuting with $A$ and satisfying the equation $B^2 = A^2$; every operator that commutes with $A$ also commutes with $B$. We shall use the operator $B$ and Lemma 1 in the next section to form the spectral function $\mathcal{E}_\lambda$ of the operator $A$ and thus prove the fundamental formula (204) of [142].
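Series (329) can be summed numerically for a small symmetric matrix to see the operator $B$ emerge. The sketch below is only an illustration under stated assumptions: the sample matrix has eigenvalues 2 and -3 (so its norm is 3 and the series converges geometrically), a fixed number of terms is used in place of a limit, and the helper names are hypothetical. When $A$ has eigenvalues near zero the convergence of the partial sums is much slower.

```python
def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def mat_scale(c, X):
    return [[c * a for a in row] for row in X]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def binom_half(n_terms):
    """Yield binom(1/2, n) for n = 0..n_terms-1, using
    binom(1/2, n+1) = binom(1/2, n) * (1/2 - n) / (n + 1)."""
    c = 1.0
    for n in range(n_terms):
        yield c
        c *= (0.5 - n) / (n + 1)

def abs_operator(A, norm_bound, n_terms=200):
    """Partial sum of (329): B = n_A * sum_n binom(1/2, n) C^n, C = A^2/n_A^2 - E."""
    n = len(A)
    E = identity(n)
    C = mat_add(mat_scale(1.0 / norm_bound ** 2, mat_mul(A, A)), mat_scale(-1.0, E))
    B = [[0.0] * n for _ in range(n)]
    power = E                      # holds C^n, starting from C^0 = E
    for coeff in binom_half(n_terms):
        B = mat_add(B, mat_scale(coeff, power))
        power = mat_mul(power, C)
    return mat_scale(norm_bound, B)

# Sample self-conjugate operator with eigenvalues 2 and -3, so n_A = 3.
A = [[1.0, 2.0], [2.0, -2.0]]
B = abs_operator(A, norm_bound=3.0)
B_squared = mat_mul(B, B)
A_squared = mat_mul(A, A)
```

Up to rounding, `B_squared` agrees with `A_squared`, and `B` commutes with `A`, in line with the properties listed above.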




161. The spectral function. THEOREM. Given any self-conjugate operator $A$, there is a corresponding projector $\mathcal{E}_0$ with the following properties: (1) if an operator $D$ commutes with $A$, it also commutes with $\mathcal{E}_0$; (2) if $Ax = 0$, then $\mathcal{E}_0x = x$; (3) the self-conjugate operators $A\mathcal{E}_0$ and $A(E - \mathcal{E}_0)$ satisfy the conditions

$$A\mathcal{E}_0 \le 0; \qquad A(E - \mathcal{E}_0) \ge 0. \tag{330}$$

We take as $\mathcal{E}_0$ the projector $P$ of Lemma 1. If $D$ commutes with $A$, it also commutes with $B$, and hence with $(A + B)$, and the first two statements of the theorem follow from Lemma 1. Further, it follows from $A = (E - 2\mathcal{E}_0)B$ and the fact that $\mathcal{E}_0$ commutes with $A$ and $B$, and $\mathcal{E}_0^2 = \mathcal{E}_0$, that

$$A\mathcal{E}_0 = -B\mathcal{E}_0; \qquad A(E - \mathcal{E}_0) = B(E - \mathcal{E}_0).$$

But the products of the positive operator $B$ with the projectors $\mathcal{E}_0$ and $(E - \mathcal{E}_0)$, with which it commutes, are positive operators, and inequalities (330) follow at once from the last formulae, and the theorem is proved.
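The theorem can be checked on a 2x2 example. The sketch below is illustrative only and rests on an extra assumption not made in the text: $B$ is invertible (i.e. $A$ has no zero eigenvalue), so that formula (317), $A = (E - 2P)B$, can be solved for the projector as $P = E - \tfrac{1}{2}(A + B)B^{-1}$. The sample matrices and helper names are hypothetical; $B$ is the positive operator with $B^2 = A^2$ for this particular $A$.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse2(X):
    """Inverse of a 2x2 matrix with non-zero determinant."""
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[X[1][1] / det, -X[0][1] / det],
            [-X[1][0] / det, X[0][0] / det]]

def is_nonnegative(X, tol=1e-12):
    """A symmetric 2x2 matrix is >= 0 iff its trace and determinant are >= 0."""
    trace = X[0][0] + X[1][1]
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return trace >= -tol and det >= -tol

E = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0], [2.0, -2.0]]        # eigenvalues 2 and -3
B = [[2.2, -0.4], [-0.4, 2.8]]       # positive, commutes with A, B^2 = A^2

# Solve A = (E - 2 E0) B for E0, assuming B is invertible.
half_sum = [[(A[i][j] + B[i][j]) / 2.0 for j in range(2)] for i in range(2)]
shifted = matmul(half_sum, inverse2(B))
E0 = [[E[i][j] - shifted[i][j] for j in range(2)] for i in range(2)]

A_E0 = matmul(A, E0)                                           # should be <= 0
complement = [[E[i][j] - E0[i][j] for j in range(2)] for i in range(2)]
A_rest = matmul(A, complement)                                 # should be >= 0
minus_A_E0 = [[-v for v in row] for row in A_E0]

print(is_nonnegative(minus_A_E0), is_nonnegative(A_rest))
```

Here `E0` turns out to be the projector onto the eigenelement with the negative eigenvalue, which is what inequalities (330) lead one to expect.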

Let $\lambda$ be any real number. We can form for the self-conjugate operator $(A - \lambda E)$ the projector mentioned in the last theorem. Call it $\mathcal{E}_\lambda$. It has the following properties: (1) if an operator $D$ commutes with $(A - \lambda E)$, or, what amounts to the same thing, with $A$, it commutes with $\mathcal{E}_\lambda$; (2) if $(A - \lambda E)x = 0$, then $\mathcal{E}_\lambda x = x$; (3) we have the inequalities

$$(A - \lambda E)\mathcal{E}_\lambda \le 0; \qquad (A - \lambda E)(E - \mathcal{E}_\lambda) \ge 0. \tag{331}$$

Notice also that, given any $\lambda$, $\mathcal{E}_\lambda$ commutes with $(A - \lambda E)$, i.e. with $A$. Let us show that $\mathcal{E}_\lambda$ represents a resolution of the identity. Every $\mathcal{E}_\lambda$ commutes with $A$, and hence, by what has been proved, with any $\mathcal{E}_\mu$. Let $\lambda < m$. We shall show that $\mathcal{E}_\lambda = 0$ here. If this were not the case, we should have an element $x$ with unit norm such that $\mathcal{E}_\lambda x = x$, so that

$$((A - \lambda E)\mathcal{E}_\lambda x, x) = ((A - \lambda E)x, x) = (Ax, x) - \lambda > 0,$$

since $\lambda < m$, and this contradicts the first of inequalities (331). Thus $\mathcal{E}_\lambda = 0$ for $\lambda < m$. Similarly, by using the second of (331), it can be shown that $\mathcal{E}_\lambda = E$ for $\lambda > M$. It remains to show that $\mathcal{E}_\lambda \le \mathcal{E}_\mu$ for $\lambda < \mu$, i.e. that $\mathcal{E}_\lambda \mathcal{E}_\mu = \mathcal{E}_\lambda$ for $\lambda < \mu$; or, what amounts to the same thing, we have to show that

$$\mathcal{E}_\lambda(E - \mathcal{E}_\mu) = 0. \tag{332}$$

Let us write $R$ for the left-hand side:

$$\mathcal{E}_\lambda(E - \mathcal{E}_\mu) = (E - \mathcal{E}_\mu)\mathcal{E}_\lambda = R. \tag{333}$$

We want to show that, given any element $x$, we have $Rx = 0$. Let us write $Rx = y$. It follows at once from (333) that

$$\mathcal{E}_\lambda R = \mathcal{E}_\lambda^2(E - \mathcal{E}_\mu) = \mathcal{E}_\lambda(E - \mathcal{E}_\mu) = R \quad\text{and similarly}\quad (E - \mathcal{E}_\mu)R = R. \tag{334}$$

We have by (331):

$$((A - \lambda E)\mathcal{E}_\lambda y, y) \le 0; \qquad ((A - \mu E)(E - \mathcal{E}_\mu)y, y) \ge 0. \tag{335}$$

On the other hand, by (334),

$$\mathcal{E}_\lambda y = \mathcal{E}_\lambda Rx = Rx = y; \qquad (E - \mathcal{E}_\mu)y = (E - \mathcal{E}_\mu)Rx = Rx = y,$$

and the first of inequalities (335) can be rewritten as $((A - \lambda E)y, y) \le 0$, and similarly, the second can be rewritten as $((A - \mu E)y, y) \ge 0$. On subtracting the last from the previous inequality, we get $((\mu - \lambda)y, y) \le 0$, i.e. $(\mu - \lambda)\|y\|^2 \le 0$, whence it follows, since $\lambda < \mu$, that $y = 0$, i.e. $Rx = 0$, and (332) is proved. We shall prove later the continuity of $\mathcal{E}_\lambda$ from the right.

We require an inequality before proving the integral form of the operator $A$ in terms of $\mathcal{E}_\lambda$. We bring in the projector

$$\Delta = \mathcal{E}_\mu - \mathcal{E}_\lambda \qquad (\mu > \lambda); \tag{336}$$

we can write for any element $x$:

$$((A - \mu E)\mathcal{E}_\mu \Delta x, \Delta x) \le 0; \qquad ((A - \lambda E)(E - \mathcal{E}_\lambda)\Delta x, \Delta x) \ge 0.$$

On using the obvious equations

$$\Delta^2 = \Delta, \qquad \mathcal{E}_\mu \Delta = (E - \mathcal{E}_\lambda)\Delta = \Delta,$$

we can rewrite our inequalities as

$$((A - \mu E)\Delta x, x) \le 0; \qquad ((A - \lambda E)\Delta x, x) \ge 0$$

or

$$\lambda(\Delta x, x) \le (A\Delta x, x) \le \mu(\Delta x, x).$$

This gives us, on taking any number $\nu$ that satisfies $\lambda \le \nu \le \mu$:

$$|((A - \nu E)\Delta x, x)| \le (\mu - \lambda)(\Delta x, x)$$

or, since $(\Delta x, x) = \|\Delta x\|^2 \le \|x\|^2$,

$$|((A - \nu E)\Delta x, x)| \le (\mu - \lambda)\|x\|^2.$$

It is clear from this [126] that the norm of the operator $(A - \nu E)\Delta$ does not exceed $(\mu - \lambda)$, i.e.

$$\|(A - \nu E)\Delta x\| \le (\mu - \lambda)\|x\|.$$

On replacing $x$ in this inequality by $\Delta x$ and noting that $\Delta^2 = \Delta$, we get an inequality fundamental for what follows:

$$\|A\Delta x - \nu\Delta x\| \le (\mu - \lambda)\|\Delta x\|. \tag{337}$$

In this inequality, A< »< pw, and A is defined by (336). We turn to the 
proof of (204). We take a positive number e, and split the interval [m — ¢,, 7] 
into sub-intervals: 

M—mH=A<KA SAK... <A <A, = M; 


? 


then we introduce the projectors 4, = ¢;, — &4,_,, where 
n 
E= SA, aud 4,4, = 0 for k #1. (338) 
k=1 


Any element 2 can be expanded in mutually orthogonal terms: 
n n 
z= Aye = > Ts 
k=1 


n 
Ar= » Axy. 
k=1 
It may easily be seen that 
(Aa,, Aa) = 0 and (Aa, 2) =0 for k #1. 
For instance, the first of these equations is proved thus: 
(Azy, Ax) = (A A,z, A A,x) = (A AA, Az), 
whilst the last expression vanishes, by (338). We now form 
ft n 
Arz— 3S Aye = SY (Arg — 142), 
k=l k=1 


where » is any value from the interval [A,_,, 4,]. The terms of the sum on the 
right are mutually orthogonal, and we can write, on using Pythagoras’ theorem: 


nm n 
|] Axe — SM Ape ||? = dF || Aoy — ry |I?- (339) 
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Let 6 be the greatest of the differences 4, — A,_,. If we use inequality (337), 
we get from (339): 


n n 
|| Ax — DS Ape |i? < FS |i ey |, 
k=1 
or, by Pythagoras’ theorem: 


|| Aw — 2 y,Aye ||? < 82 | @ 
k= 


whence it follows that, for any element x, as 6 + 0: 
n 
Az = lim > Vp Apa. 
k=1 


We thus arrive at the basic formula 


M 
A=Jid&. 


m 


It remains to show that @, is continuous from the right. As uz tends to A, 
the projector 4 defined by (336) does not increase, and tends to some limit 4); 
we have to show that 4, is the annihilation operator. On passing to the limit in 
(337), we get (A — AH) A,x = 0. Hence it follows, in view of the second 
property of €,, that €, 4,2 = Ayx, ie. (EH — €,) 4g x = 0. On the other hand, 
(EZ — &,) A= A, and we obtain on passing to the limit: (H — &,) 4d) = do. 
Since (E — &,) Ayx = 0, we get A,x = 0, ie. A, in fact annihilates any ele- 
ment 2. Notice also that any operator D that commutes with A also commutes 
with 6). 
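
The estimate obtained from (337) and (339) can be checked numerically in the
simplest possible setting. The following is only a minimal sketch under stated
assumptions: the operator is taken to be a finite diagonal matrix, so that
$\mathcal{E}_\lambda$ is the projector onto the coordinates whose eigenvalue
does not exceed $\lambda$; the function name `spectral_sum_error`, the sample
eigenvalues and the choice $\varepsilon_0 = 1$ are illustrative and do not
come from the text.

```python
import math

def spectral_sum_error(eigs, x, n, eps0=1.0):
    """Compare A x with the sum  sum_k nu_k * Delta_k x  for a diagonal A.

    eigs : real eigenvalues of the (assumed) diagonal operator A
    x    : components of the element x in the eigenbasis
    n    : number of sub-intervals of [m - eps0, M]
    Returns (norm of the difference, delta * norm of x).
    """
    m, M = min(eigs), max(eigs)
    lo = m - eps0
    # lambda_0 < lambda_1 < ... < lambda_n = M
    lam = [lo + (M - lo) * k / n for k in range(n + 1)]
    lam[-1] = M
    delta = max(lam[k] - lam[k - 1] for k in range(1, n + 1))

    approx = [0.0] * len(x)
    for k in range(1, n + 1):
        nu = 0.5 * (lam[k - 1] + lam[k])  # any nu_k in [lambda_{k-1}, lambda_k]
        for j, e in enumerate(eigs):
            # Delta_k keeps coordinate j exactly when lambda_{k-1} < e <= lambda_k
            if lam[k - 1] < e <= lam[k]:
                approx[j] += nu * x[j]

    Ax = [e * xj for e, xj in zip(eigs, x)]
    err = math.sqrt(sum((p - q) ** 2 for p, q in zip(Ax, approx)))
    norm_x = math.sqrt(sum(xj ** 2 for xj in x))
    return err, delta * norm_x

eigs = [-1.0, 0.3, 0.3, 2.0, 5.0]
x = [1.0, 2.0, -1.0, 0.5, 3.0]
for n in (4, 16, 64):
    err, bound = spectral_sum_error(eigs, x, n)
    print(n, err, bound)
```

In this sketch the error stays below $\delta\,\|x\|$ for every partition, as
the inequality derived from (339) requires, and it shrinks as the partition is
refined.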


§ 2. Spaces $l_2$ and $L_2$


162. Linear operators in $l_2$. We shall now apply the general theory
to spaces $l_2$ and $L_2$. We have already seen that, by choosing some
closed orthonormal system in $H$, a one-to-one correspondence is
obtained between elements of the abstract space $H$ and $l_2$. Naturally,
$l_2$ can be regarded on its own as a concrete form of $H$, since all the
axioms of $H$ hold, given the usual definitions of algebra and of scalar
product in $l_2$.

The concept of cut-off element [cf. 134] must be introduced. Let
$x(\xi_1, \xi_2, \dots)$ be an element of $l_2$ and let
$x^{(k)}(\xi_1, \xi_2, \dots, \xi_k, 0, 0, \dots)$
have the same first $k$ components as $x$, its remaining components
being zero. The element $x^{(k)}$ is called the cut-off of $x$. We have

$$\|x - x^{(k)}\|^2 = \sum_{m=k+1}^{\infty} |\xi_m|^2 \to 0 \quad\text{as } k \to \infty, \tag{1}$$

i.e. $x^{(k)} \to x$ as $k \to \infty$. Let $\varphi_1, \varphi_2, \dots$
denote the base vectors of $l_2$, i.e. $\xi_k = 1$ for $\varphi_k$, and the
remaining components are zero. We have for an element $x$:

$$x = \sum_{n=1}^{\infty} \xi_n \varphi_n. \tag{2}$$

If $A$ is a linear operator in $l_2$ and $x' = Ax$, we have, on introducing
the components $x'(\xi'_1, \xi'_2, \dots)$:

$$\xi'_n = (x', \varphi_n) = \sum_{m=1}^{\infty} a_{nm}\,\xi_m \quad (n = 1, 2, \dots), \tag{3}$$

where

$$a_{nm} = (A\varphi_m, \varphi_n). \tag{4}$$

A linear operator in $l_2$ can therefore be represented by a matrix
with elements (4). The matrix corresponding to the conjugate operator
$A^{*}$ is [cf. 134]:

$$a^{*}_{nm} = (A^{*}\varphi_m, \varphi_n) = (\varphi_m, A\varphi_n) = \bar{a}_{mn}. \tag{5}$$

A self-conjugate operator is characterized by the equation

$$a_{nm} = \bar{a}_{mn}. \tag{6}$$


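We have for a bilinear functional:

$$(Ax, y) = (x, A^{*}y) = \sum_{n=1}^{\infty}\Bigl(\sum_{m=1}^{\infty} a_{nm}\,\xi_m\Bigr)\bar{\eta}_n = \sum_{m=1}^{\infty} \xi_m \Bigl(\sum_{n=1}^{\infty} a_{nm}\,\bar{\eta}_n\Bigr), \tag{7}$$

where $y$ has the components $(\eta_1, \eta_2, \dots)$.

On forming the cut-off elements $x^{(k)}$ and $y^{(l)}$ for $x$ and $y$, we have

$$(Ax^{(k)}, y^{(l)}) = \sum_{n=1}^{l} \sum_{m=1}^{k} a_{nm}\,\xi_m\,\bar{\eta}_n.$$

But $(Ax^{(k)}, y^{(l)}) \to (Ax, y)$ as $k$ and $l \to \infty$, so that

$$\sum_{n=1}^{\infty}\Bigl(\sum_{m=1}^{\infty} a_{nm}\,\xi_m\Bigr)\bar{\eta}_n = \lim_{k,\,l \to \infty} \sum_{n=1}^{l} \sum_{m=1}^{k} a_{nm}\,\xi_m\,\bar{\eta}_n. \tag{8}$$

If $a_{ik}$ and $b_{ik}$ are the elements of the matrices corresponding to
operators $A$ and $B$, the matrix for the operator $D = BA$ is $d_{pq}$,
defined by

$$d_{pq} = (D\varphi_q, \varphi_p) = (BA\varphi_q, \varphi_p) = (A\varphi_q, B^{*}\varphi_p),$$

or, using the formula for a scalar product in $l_2$:

$$d_{pq} = \sum_{s=1}^{\infty} (A\varphi_q, \varphi_s)\,\overline{(B^{*}\varphi_p, \varphi_s)} = \sum_{s=1}^{\infty} a_{sq}\,\overline{b^{*}_{sp}}. \tag{9}$$

On taking (5) into account, we finally obtain

$$d_{pq} = \sum_{s=1}^{\infty} b_{ps}\,a_{sq}.$$

If we use the same letters $A$ and $B$ for the infinite matrices as for
the operators, and write $\{A\}_{ik}$ and $\{B\}_{ik}$ for the elements of
these matrices, the above formula can be written as

$$\{BA\}_{pq} = \sum_{s=1}^{\infty} \{B\}_{ps}\,\{A\}_{sq}. \tag{10}$$

Given three linear bounded operators $A$, $B$ and $C$, the associative law
$(CB)A = C(BA)$ can be used to write the following formula for
interchanging summations:

$$\sum_{t=1}^{\infty}\Bigl(\sum_{s=1}^{\infty} \{C\}_{ps}\,\{B\}_{st}\Bigr)\{A\}_{tq} = \sum_{s=1}^{\infty} \{C\}_{ps}\Bigl(\sum_{t=1}^{\infty} \{B\}_{st}\,\{A\}_{tq}\Bigr). \tag{11}$$
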
163. Bounded operators. As we have seen, every bounded linear
operator generates an infinite matrix $a_{ik}$. Let us pose the converse
problem: what sort of elements $a_{ik}$ must an infinite matrix have in
order for (3) to yield a bounded linear operator in $l_2$? We shall require
that series (3) be convergent for any element $(\xi_1, \xi_2, \dots)$ of
$l_2$, and that there exist an $N$ such that

$$\sum_{n=1}^{\infty} \Bigl|\sum_{k=1}^{\infty} a_{nk}\,\xi_k\Bigr|^2 \le N^2 \sum_{k=1}^{\infty} |\xi_k|^2 \tag{12}$$

for any element $x \in l_2$.
Remember that, in the case of a bounded operator $A$ we must have
for a bilinear functional:

$$|(Ax, y)| \le N\,\|x\| \cdot \|y\|.$$

On applying this inequality to the cut-off elements, the following
necessary condition, containing finite sums, is obtained for the $a_{pq}$:

$$\Bigl|\sum_{n=1}^{l} \sum_{m=1}^{k} a_{nm}\,\xi_m\,\bar{\eta}_n\Bigr|^2 \le N^2 \sum_{m=1}^{k} |\xi_m|^2 \sum_{n=1}^{l} |\eta_n|^2. \tag{13}$$

This condition is in fact sufficient as well as necessary, i.e. the
following theorem holds:

THEOREM 1. The necessary and sufficient condition for $a_{pq}$ to be
elements of the matrix of a linear bounded transformation is that, given
any positive integers $k$ and $l$, and any complex numbers $\xi_p$ and
$\eta_q$, condition (13) is satisfied for some choice of the number $N$
(independent of $\xi_p$, $\eta_q$, $k$ and $l$).

The necessity has been explained above. Let us prove the sufficiency.
Let $(\xi_1, \xi_2, \dots)$ be any element of $l_2$. We put $l = k$ in (13)
and

$$\eta_n = \sum_{m=1}^{k} a_{nm}\,\xi_m \quad (n = 1, 2, \dots, k).$$

It now gives

$$\Bigl(\sum_{n=1}^{k} \Bigl|\sum_{m=1}^{k} a_{nm}\,\xi_m\Bigr|^2\Bigr)^2 \le N^2 \sum_{m=1}^{k} |\xi_m|^2 \sum_{n=1}^{k} \Bigl|\sum_{m=1}^{k} a_{nm}\,\xi_m\Bigr|^2$$

or

$$\sum_{n=1}^{k} \Bigl|\sum_{m=1}^{k} a_{nm}\,\xi_m\Bigr|^2 \le N^2 \sum_{m=1}^{k} |\xi_m|^2$$

and all the more

$$\sum_{n=1}^{k} \Bigl|\sum_{m=1}^{k} a_{nm}\,\xi_m\Bigr|^2 \le N^2 \sum_{m=1}^{\infty} |\xi_m|^2. \tag{14}$$

We shall show that these inequalities imply the convergence of

$$\sum_{m=1}^{\infty} a_{nm}\,\xi_m \quad (n = 1, 2, \dots) \tag{15}$$

for any element of $l_2$. Suppose that the series is divergent for some
choice of element $(\xi_1^{(0)}, \xi_2^{(0)}, \dots)$ and number $n$. In this
case the series

$$\sum_{m=1}^{\infty} |a_{nm}\,\xi_m^{(0)}|$$

is all the more divergent, and the finite sum of this series:

$$\sum_{m=1}^{k} |a_{nm}\,\xi_m^{(0)}|$$

will increase indefinitely as $k$ increases. We vary the arguments of
the complex numbers $\xi_m^{(0)}$ in such a way that the products
$a_{nm}\,\xi_m^{(0)}$ are positive numbers. On applying inequality (14) to the
element of $l_2$ thus obtained, and throwing away from the left-hand side all
the terms except the one corresponding to the value of $n$ mentioned, we
obtain

$$\Bigl(\sum_{m=1}^{k} a_{nm}\,\xi_m^{(0)}\Bigr)^2 \le N^2 \sum_{m=1}^{\infty} |\xi_m^{(0)}|^2.$$

The left-hand side increases indefinitely on indefinite increase of
$k$, and we have arrived at a contradiction. Thus all the series (15)
are in fact convergent for any element $x$. Let us now show that (14)
implies the inequality

$$\sum_{n=1}^{k} \Bigl|\sum_{m=1}^{\infty} a_{nm}\,\xi_m\Bigr|^2 \le N^2 \sum_{m=1}^{\infty} |\xi_m|^2. \tag{16}$$

In fact, if we were to have the reverse inequality for a certain $k$
and a certain element $x$ of $l_2$, we should have for this $k$, and
sufficiently large $l$ (it can obviously be assumed that $l > k$):

$$\sum_{n=1}^{k} \Bigl|\sum_{m=1}^{l} a_{nm}\,\xi_m\Bigr|^2 > N^2 \sum_{m=1}^{\infty} |\xi_m|^2$$

and all the more

$$\sum_{n=1}^{l} \Bigl|\sum_{m=1}^{l} a_{nm}\,\xi_m\Bigr|^2 > N^2 \sum_{m=1}^{\infty} |\xi_m|^2,$$

and this contradicts (14) with $k = l$. Inequality (16) is thus proved.
On letting $k$ increase indefinitely in it, we arrive at (12), and the
theorem is proved.

Note. Notice that we only used (13) in the case $l = k$ when
proving its sufficiency. Let us show that it is sufficient to verify this
condition only for quadratic forms, i.e. the sufficient condition for
the operator to be bounded is that, for any $k$,

$$\Bigl|\sum_{n,m=1}^{k} a_{nm}\,\xi_m\,\bar{\xi}_n\Bigr| \le N \sum_{m=1}^{k} |\xi_m|^2. \tag{17}$$

If we use (17) and the formula expressing a bilinear functional in
terms of the corresponding quadratic form, we can write

$$\Bigl|\sum_{m,n=1}^{k} a_{nm}\,\xi_m\,\bar{\eta}_n\Bigr| \le \frac{N}{4}\Bigl[\sum_{m=1}^{k} |\xi_m + \eta_m|^2 + \sum_{m=1}^{k} |\xi_m - \eta_m|^2 + \sum_{m=1}^{k} |\xi_m + i\eta_m|^2 + \sum_{m=1}^{k} |\xi_m - i\eta_m|^2\Bigr].$$

Using $|\alpha \pm \beta|^2 \le 2\,(|\alpha|^2 + |\beta|^2)$, we obtain, on the
assumption that the norms of $x$ and $y$ are unity:

$$\Bigl|\sum_{m,n=1}^{k} a_{nm}\,\xi_m\,\bar{\eta}_n\Bigr| \le 4N,$$

whilst for elements with any norm:

$$\Bigl|\sum_{m,n=1}^{k} a_{nm}\,\xi_m\,\bar{\eta}_n\Bigr| \le 4N \Bigl[\sum_{m=1}^{k} |\xi_m|^2\Bigr]^{1/2} \Bigl[\sum_{m=1}^{k} |\eta_m|^2\Bigr]^{1/2},$$

i.e. condition (13) with $l = k$ follows from (17), i.e. the operator
defined by the matrix $a_{nm}$ must be bounded. Some further facts must
be mentioned in connection with condition (13). If $a_{ik}$ satisfy
condition (13), the elements of the matrix of the conjugate operator
$a^{*}_{ik} = \bar{a}_{ki}$ obviously also satisfy this condition, as must
in fact be the case in view of the general theory. We must also consider
the matrix of the transposition operator and the matrix of the complex
conjugate operator:

$$\{A'\}_{pq} = a_{qp}; \qquad \{\bar{A}\}_{pq} = \bar{a}_{pq}. \tag{18}$$

We obviously have

$$A^{*} = (\bar{A})' = \overline{A'}, \tag{19}$$

and the elements of the matrices of operators $A'$ and $\bar{A}$ obviously
satisfy condition (13), provided it is satisfied by the elements of the
basic matrix $A$. It follows at once from (13) that all the $a_{pq}$ must be
bounded in modulus by a number independent of $p$ and $q$; in fact,
on setting $\xi_q = \eta_p = 1$, and the remaining $\xi_m$ and $\eta_n$ equal
to zero, we get $|a_{pq}| \le N$. There is also a necessary condition that
must be satisfied by the elements of the matrix of a bounded transformation.
It follows from (4) that the $a_{nm}$ are the components of the element
$A\varphi_m$. The series formed by the squares of the moduli of the elements
of any column must therefore be convergent:

$$\sum_{n=1}^{\infty} |a_{nm}|^2 < +\infty \quad (m = 1, 2, \dots). \tag{20}$$

On passing to $A^{*}$, it will be seen that the same must be true as
regards the rows:

$$\sum_{m=1}^{\infty} |a_{nm}|^2 < +\infty \quad (n = 1, 2, \dots). \tag{21}$$
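
For reference, the formula expressing a bilinear functional in terms of the
corresponding quadratic form, used in the Note above, can be written out
explicitly. What follows is the standard polarization identity, stated under
the assumption (consistent with (3)) that the scalar product is linear in its
first argument and conjugate-linear in its second; the abbreviation
$Q(x) = (Ax, x)$ is introduced only for this remark.

$$(Ax, y) = \tfrac{1}{4}\bigl[\,Q(x + y) - Q(x - y) + i\,Q(x + iy) - i\,Q(x - iy)\,\bigr].$$

If $|Q(z)| \le N\,\|z\|^2$ for every cut-off element $z$, as in (17), taking
moduli term by term gives

$$|(Ax, y)| \le \frac{N}{4}\bigl[\,\|x + y\|^2 + \|x - y\|^2 + \|x + iy\|^2 + \|x - iy\|^2\,\bigr],$$

which is the inequality used in the Note once the norms are written out in
components.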
Let us notice a simple sufficient condition for the transformation
corresponding to the matrix $a_{nm}$ to be bounded.

THEOREM 2. If there exists a positive number $l$ (not depending on
$m$ and $n$), such that the inequalities

$$\sum_{m=1}^{\infty} |a_{nm}| \le l \quad (n = 1, 2, \dots); \tag{22}$$

$$\sum_{n=1}^{\infty} |a_{nm}| \le l \quad (m = 1, 2, \dots) \tag{23}$$

are fulfilled, the matrix $a_{nm}$ yields a bounded transformation.
It is sufficient to show that, when $\|x\| \le 1$ and $\|y\| \le 1$, the
sum

$$S = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} |a_{nm}|\,|\xi_m|\,|\eta_n| \tag{24}$$

is bounded. In this case, (13) will all the more be satisfied. On
observing that $|ab| \le \tfrac{1}{2}(|a|^2 + |b|^2)$, we can write

$$S \le \frac{1}{2} \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} |a_{nm}|\,(|\xi_m|^2 + |\eta_n|^2) = \frac{1}{2} \sum_{m=1}^{\infty} \Bigl[|\xi_m|^2 \sum_{n=1}^{\infty} |a_{nm}|\Bigr] + \frac{1}{2} \sum_{n=1}^{\infty} \Bigl[|\eta_n|^2 \sum_{m=1}^{\infty} |a_{nm}|\Bigr]$$

or, by (22) and (23),

$$S \le \frac{l}{2} \sum_{m=1}^{\infty} |\xi_m|^2 + \frac{l}{2} \sum_{n=1}^{\infty} |\eta_n|^2 = \frac{l}{2}\,(\|x\|^2 + \|y\|^2) \le l,$$

and the theorem is proved. In the case of a self-conjugate matrix,
(23) follows from (22). Notice that series (24) is not convergent for
every bounded matrix.
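
As a numerical illustration of Theorem 2, the following minimal sketch checks
the resulting bound $|(Ax, y)| \le l\,\|x\|\,\|y\|$ on a finite section of a
matrix. The matrix $a_{nm} = 2^{-|n-m|}$, the test vectors and the helper
names `schur_bound`, `bilinear` and `norm` are assumptions chosen for the
example; they do not appear in the text.

```python
import math

def schur_bound(a):
    """Largest absolute row sum or column sum of a finite matrix (the number l)."""
    rows = max(sum(abs(v) for v in r) for r in a)
    cols = max(sum(abs(a[n][m]) for n in range(len(a))) for m in range(len(a[0])))
    return max(rows, cols)

def bilinear(a, x, y):
    """Finite form  sum_{n,m} a[n][m] * x[m] * conj(y[n])."""
    return sum(a[n][m] * x[m] * complex(y[n]).conjugate()
               for n in range(len(a)) for m in range(len(a[0])))

def norm(v):
    return math.sqrt(sum(abs(c) ** 2 for c in v))

k = 30
# Assumed example matrix: every row and column sum is below 1 + 1 + 1 = 3.
a = [[2.0 ** (-abs(n - m)) for m in range(k)] for n in range(k)]
l = schur_bound(a)
x = [1.0 / (m + 1) for m in range(k)]
y = [(-1.0) ** n / (n + 1) for n in range(k)]
lhs = abs(bilinear(a, x, y))
rhs = l * norm(x) * norm(y)
print(l, lhs, rhs)
```

The same number $l$ works for every section size $k$, which is what allows
the passage from the finite sums in (13) to the whole of $l_2$.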

As an example, let us take the two infinite matrices:

$$A = \begin{pmatrix} 0 & 0 & 0 & \cdots \\ 1 & 0 & 0 & \cdots \\ 0 & 1 & 0 & \cdots \\ 0 & 0 & 1 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}; \qquad B = \begin{pmatrix} a_1 & 1 & 0 & 0 & 0 & \cdots \\ a_2 & 0 & 1 & 0 & 0 & \cdots \\ a_3 & 0 & 0 & 1 & 0 & \cdots \\ a_4 & 0 & 0 & 0 & 1 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix},$$

where the $a_k$ form a sequence of real numbers such that the series with
common term $|a_k|$ is convergent. Using Theorem 2, it is easily shown
that linear transformations correspond to the matrices $A$ and $B$.
It is also easy to show, by using (9), that $BA = E$ for any choice of
$a_k$ (satisfying the above condition). Thus a linear operator $A$ has an
infinite set of inverse bounded operators from the left, and hence no
bounded inverse from the right. If we pass to the conjugate matrices,
i.e. since the $a_k$ are real, simply to the transposes $A'$ and $B'$, we get
$A'B' = E$. The equation $Ax = y$ has the form $0 = \eta_1$;
$\xi_1 = \eta_2$; $\xi_2 = \eta_3$; ... and, given any $y \in l_2$ with
$\eta_1 = 0$, it has a unique solution in $l_2$. The equation
$A'x = y$ has the form $\xi_2 = \eta_1$; $\xi_3 = \eta_2$; ..., and it has an
infinite set of solutions, since $\xi_1$ is arbitrary. This is bound up with
the fact that $A'$ has no bounded inverse from the left.
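
The relation $BA = E$ can be verified on finite sections. The sketch below is
an illustration under stated assumptions: it uses the $(n+1) \times n$ section
of $A$ and the $n \times (n+1)$ section of $B$, for which formula (10)
involves no truncated terms, and the helper names `right_shift`,
`left_inverse` and `matmul` as well as the sample sequences $a_k$ are
hypothetical.

```python
def right_shift(n):
    """(n+1) x n section of A: ones immediately below the principal diagonal."""
    return [[1 if r == c + 1 else 0 for c in range(n)] for r in range(n + 1)]

def left_inverse(a):
    """n x (n+1) section of B: row p holds a_p in column 1 and 1 in column p+1."""
    n = len(a)
    return [[a[p] if c == 0 else (1 if c == p + 1 else 0) for c in range(n + 1)]
            for p in range(n)]

def matmul(P, Q):
    """Finite version of formula (10)."""
    return [[sum(P[i][s] * Q[s][j] for s in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

n = 5
A = right_shift(n)
identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
B1 = left_inverse([2.0 ** -(p + 1) for p in range(n)])  # a_k = 2^{-k}, summable
B2 = left_inverse([0.0] * n)                            # a_k = 0
BA1 = matmul(B1, A)
BA2 = matmul(B2, A)
AB1 = matmul(A, B1)
print(BA1 == identity, BA2 == identity, AB1[0])
```

Two different choices of the $a_k$ give two different left inverses, while
the first row of the section of $AB$ is identically zero, so that $AB \ne E$.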


164. Unitary matrices and projection matrices. Let us recall the
fundamental property of a unitary transformation $U$:

$$U^{*}U = UU^{*} = E.$$

If $u_{pq}$ are the elements of the matrix corresponding to $U$, these
conditions can be written as

$$\sum_{s=1}^{\infty} u^{*}_{ps}\,u_{sq} = \delta_{pq}; \qquad \sum_{s=1}^{\infty} u_{ps}\,u^{*}_{sq} = \delta_{pq}, \tag{25}$$

where $\delta_{pq} = 0$ for $p \ne q$ and $\delta_{pp} = 1$. Noting that
$u^{*}_{nm} = \bar{u}_{mn}$, we can write these last equations as

$$\sum_{s=1}^{\infty} \bar{u}_{sp}\,u_{sq} = \delta_{pq}; \tag{26}$$

$$\sum_{s=1}^{\infty} u_{ps}\,\bar{u}_{qs} = \delta_{pq}, \tag{27}$$

i.e. the matrix $u_{pq}$ is found to be orthogonal with respect to its columns
and rows. Notice that, in the case of finite matrices, conditions (27)
are consequences of (26) [III; 28]. These conditions are independent
in the case of infinite matrices.
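
A simple instance of this independence, added here as an illustration and
not taken from the text, is the matrix with ones immediately below the
principal diagonal:

$$u_{pq} = \begin{cases} 1 & \text{for } p = q + 1, \\ 0 & \text{otherwise.} \end{cases}$$

Its columns are orthonormal, since
$\sum_{s} \bar{u}_{sp}\,u_{sq} = \delta_{pq}$, so that (26) holds. Its first
row consists of zeros, however, so that
$\sum_{s} u_{1s}\,\bar{u}_{1s} = 0 \ne 1$ and (27) fails for $p = q = 1$.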

THEOREM 3. Conditions (26) and (27) are necessary and sufficient
for the complex numbers $u_{pq}$ to form a matrix corresponding to a unitary
transformation.

The necessity of (26) and (27) follows from the above discussion.
It remains to prove their sufficiency, i.e. we have to show that, given
these conditions, a linear (bounded) transformation corresponds to
the matrix $u_{pq}$. The fact that this transformation is unitary follows
afterwards, from the fact that conditions (26) and (27) are equivalent
to (25), these latter being characteristic of a unitary transformation
[137].

Condition (27) shows that

$$\sum_{s=1}^{\infty} |u_{ps}|^2 = 1,$$

so that the series

$$\xi'_p = \sum_{k=1}^{\infty} u_{pk}\,\xi_k \tag{28}$$

are convergent for any element $x$ [59].
We form the expression

$$\sum_{p=1}^{m} \Bigl|\sum_{q=1}^{n} u_{pq}\,\xi_q\Bigr|^2 = \sum_{p=1}^{m} \sum_{s=1}^{n} \sum_{t=1}^{n} u_{ps}\,\bar{u}_{pt}\,\xi_s\,\bar{\xi}_t = \sum_{s=1}^{n} \sum_{t=1}^{n} \Bigl(\sum_{p=1}^{m} u_{ps}\,\bar{u}_{pt}\Bigr) \xi_s\,\bar{\xi}_t.$$

On letting $m$ increase indefinitely, we find, by (26), that

$$\sum_{p=1}^{\infty} \Bigl|\sum_{q=1}^{n} u_{pq}\,\xi_q\Bigr|^2 = \sum_{s=1}^{n} |\xi_s|^2$$

and all the more

$$\sum_{p=1}^{m} \Bigl|\sum_{q=1}^{n} u_{pq}\,\xi_q\Bigr|^2 \le \sum_{s=1}^{\infty} |\xi_s|^2,$$

where $m$ is any fixed finite number. This inequality still holds with
$n = \infty$, as follows by assuming the contrary, precisely as in the proof
of Theorem 1 of the previous section. If we then let $m$ increase
indefinitely, we arrive at the inequality

$$\sum_{p=1}^{\infty} |\xi'_p|^2 \le \sum_{s=1}^{\infty} |\xi_s|^2, \tag{29}$$

which shows that the operator $U$ is bounded. Notice that, since $U$ is
unitary, we must have the sign of equality in (29).

Let us now consider the matrix $p_{ik}$ corresponding to a projector
$P$ onto subspace $L$. On recalling that $P$ is a self-conjugate operator,
and that $P^2 = P$, we get the following conditions:

$$p_{ki} = \bar{p}_{ik}; \qquad \sum_{s=1}^{\infty} p_{is}\,p_{sk} = p_{ik}. \tag{30}$$

It can be shown, precisely as above for unitary transformations,
that these conditions are sufficient as well as necessary for the matrix
$p_{ik}$ to correspond to a projector. We choose a closed orthonormal
system so that part of the base vectors form a closed system in the
subspace $L$, and the other part a closed system in the complementary
subspace $H - L$. The former base vectors remain unchanged as a
result of application of the projector $P$, whilst the others are
annihilated. Thus, given the chosen system of base vectors, the projector $P$
corresponds to a purely diagonal matrix, the principal diagonal of
which consists only of ones and zeros. In other words, the matrix of
a projection is the unitary equivalent of a purely diagonal matrix,
the principal diagonal of which consists only of ones and zeros.


165. Self-conjugate matrices. A self-conjugate matrix $A$ is
characterized by conditions (13) and (6). The eigenvalues and eigenelements
of such matrices are defined from the condition that the infinite system

$$\sum_{k=1}^{\infty} a_{ik}\,\xi_k = \lambda\,\xi_i \quad (i = 1, 2, \dots) \tag{31}$$

has non-zero solutions in $l_2$. If the eigenelements $\psi_k$ form a closed
orthonormal system, and they are taken as the base vectors, the
matrix corresponding to the operator $A$ will have the elements

$$a_{pq} = (A\psi_q, \psi_p) = \lambda_q\,(\psi_q, \psi_p) = \begin{cases} 0 & \text{for } p \ne q, \\ \lambda_p & \text{for } p = q, \end{cases} \tag{32}$$

i.e. we get a purely diagonal matrix with the numbers $\lambda_k$ on the
principal diagonal. In general, the necessary and sufficient condition
for a self-conjugate matrix to have a purely point spectrum is that it
be the unitary equivalent of a purely diagonal matrix. In the above
case, when the $\psi_k$ are the base vectors, we have

$$(Ax, y) = \sum_{k=1}^{\infty} \lambda_k\,\xi_k\,\bar{\eta}_k; \qquad (Ax, x) = \sum_{k=1}^{\infty} \lambda_k\,|\xi_k|^2. \tag{33}$$

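In the general case, given a self-conjugate matrix $a_{ik}$, there exists
a resolution of the identity $\mathcal{E}_\lambda$, i.e. a non-decreasing
projection matrix $l_{ik}(\lambda)$, such that

$$l_{ik}(a) = 0; \qquad l_{ik}(b) = \begin{cases} 0 & \text{for } i \ne k, \\ 1 & \text{for } i = k, \end{cases} \quad (i, k = 1, 2, \dots); \tag{34}$$

and we have the formulae:

$$\sum_{k=1}^{\infty} a_{ik}\,\xi_k = (Ax, \varphi_i) = \int_a^b \lambda\, d(\mathcal{E}_\lambda x, \varphi_i) = \int_a^b \lambda\, d\Bigl(\sum_{k=1}^{\infty} l_{ik}(\lambda)\,\xi_k\Bigr),$$

i.e.

$$a_{ik} = \int_a^b \lambda\, dl_{ik}(\lambda); \tag{35}$$

$$(Ax, y) = \int_a^b \lambda\, d\Bigl(\sum_{s,t=1}^{\infty} l_{st}(\lambda)\,\xi_t\,\bar{\eta}_s\Bigr); \qquad (Ax, x) = \int_a^b \lambda\, d\Bigl(\sum_{s,t=1}^{\infty} l_{st}(\lambda)\,\xi_t\,\bar{\xi}_s\Bigr). \tag{36}$$

We have here, by the properties of resolution of the identity

$$\sum_{s=1}^{\infty} l_{is}(\lambda_1)\,l_{sk}(\lambda_2) = \sum_{s=1}^{\infty} l_{is}(\lambda_2)\,l_{sk}(\lambda_1) = l_{ik}(\lambda_1) \quad\text{for } \lambda_1 \le \lambda_2, \tag{37}$$

and in general

$$\sum_{s=1}^{\infty} \Delta_1 l_{is}(\lambda) \cdot \Delta_2 l_{sk}(\lambda) = \Delta_1 \Delta_2\, l_{ik}(\lambda), \tag{38}$$

where the right-hand side represents the difference in the values of
$l_{ik}(\lambda)$ at the ends of the interval $\Delta_1 \Delta_2$, this latter
being the intersection of the intervals $\Delta_1$ and $\Delta_2$. If
$f(\lambda)$ is a non-decreasing function in the interval $[a, b]$, the
operator $f(A)$ corresponds to a matrix with elements

$$\{f(A)\}_{ik} = \int_a^b f(\lambda)\, dl_{ik}(\lambda). \tag{39}$$

We can write [43]:

$$\sum_{s=1}^{\infty} \int_a^b f_1(\lambda)\, dl_{is}(\lambda) \cdot \int_a^b f_2(\lambda)\, dl_{sk}(\lambda) = \int_a^b f_1(\lambda)\,f_2(\lambda)\, dl_{ik}(\lambda). \tag{40}$$
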

Noticing that the bilinear form

$$(\mathcal{E}_\lambda x, y) = \sum_{s,t=1}^{\infty} l_{st}(\lambda)\,\xi_t\,\bar{\eta}_s \tag{41}$$

is a function of bounded variation of $\lambda$, we can say that the functions
$l_{ik}(\lambda)$ are of bounded variation in $\lambda$. When $y = x$, (41)
yields a non-decreasing function of $\lambda$, and it follows at once from
this that the functions $l_{kk}(\lambda)$ are not decreasing. If (39) are
understood as Lebesgue-Stieltjes integrals, this formula is applicable for
the wide class of functions $f(\lambda)$, which we indicated in [155]. It is
sufficient to assume that $f(\lambda)$ is bounded and is a $B$ function [47].
In this case it will be measurable with respect to any non-decreasing
function. In the case of a purely continuous spectrum, all the functions
$l_{ik}(\lambda)$ are continuous. The converse is also true. In the case of a
mixed spectrum, we consider the subspace $L$, in which the operator $A$ has a
purely point spectrum, and the complementary subspace $H - L$, in which $A$
has a purely continuous spectrum. Let us introduce closed orthonormal systems
into these subspaces. On writing $(\xi'_1, \xi'_2, \dots)$ for the elements in
$L$, and $(\xi''_1, \xi''_2, \dots)$ for the elements in $H - L$, we can write
the bilinear and quadratic forms as

$$\sum_{i,k=1}^{\infty} a_{ik}\,\xi_k\,\bar{\eta}_i = \sum_{k} \lambda_k\,\xi'_k\,\bar{\eta}'_k + \int_a^b \lambda\, d\Bigl(\sum_{s,t=1}^{\infty} l_{st}(\lambda)\,\xi''_t\,\bar{\eta}''_s\Bigr),$$

$$\sum_{i,k=1}^{\infty} a_{ik}\,\xi_k\,\bar{\xi}_i = \sum_{k} \lambda_k\,|\xi'_k|^2 + \int_a^b \lambda\, d\Bigl(\sum_{s,t=1}^{\infty} l_{st}(\lambda)\,\xi''_t\,\bar{\xi}''_s\Bigr), \tag{42}$$

where the $l_{st}(\lambda)$ are continuous.
Let us consider the resolvent of the matrix $A$, i.e. the matrix with
elements $\{R(\lambda)\}_{ik}$, defined by

$$\{R(\lambda)\}_{ik} = \{(A - \lambda E)^{-1}\}_{ik}.$$

We have

$$\{R(\lambda)\}_{ik} = \int_a^b \frac{dl_{ik}(\mu)}{\mu - \lambda} \tag{43}$$

on the assumption that $\lambda$ does not belong to the spectrum of $A$.
Notice that, by (39), positive integral powers of $A$ can be written as

$$\{A^m\}_{ik} = \int_a^b \lambda^m\, dl_{ik}(\lambda). \tag{44}$$

If $\lambda = 0$ does not belong to the spectrum of $A$, i.e. all the
$l_{ik}(\lambda)$ are constant in some neighbourhood of $\lambda = 0$, there
exists a bounded inverse matrix $A^{-1}$, and powers of it are given by

$$\{A^{-m}\}_{ik} = \int_a^b \lambda^{-m}\, dl_{ik}(\lambda). \tag{45}$$

166. The case of a continuous spectrum. As we know, a further
dissection of $H - L$ can be carried out, into subspaces invariant
with respect to $A$, in each of which $A$ has a simple continuous spectrum.
Let $H_1$ be a subspace of this type. We introduce a closed orthogonal
system into it and take our future remarks to refer to $H_1$; we can
regard $H_1$ as a Hilbert space, into which the base vectors
$\varphi_1, \varphi_2, \dots$ are introduced, so that every element $x$ is
defined by its components $(\xi_1, \xi_2, \dots)$. Let $x$ be an element of
$H_1$, such that the closed linear envelope of $\mathcal{E}_\lambda x$ is the
whole of $H_1$, where $a \le \lambda \le b$. Let $p_k(\lambda)$ denote
the components of the element $\mathcal{E}_\lambda x$. Any element
$y(\eta_1, \eta_2, \dots)$ of $H_1$ is associated with the function

$$p_y(\lambda) = (y, \mathcal{E}_\lambda x) = \sum_{k=1}^{\infty} \eta_k\,\overline{p_k(\lambda)}. \tag{46}$$

Using formula (259) of [147], we can write a bilinear functional, for
elements $y$ and $z$ of $H_1$, in the form

$$(Ay, z) = \int_a^b \lambda\, \frac{dp_y(\lambda)\,\overline{dp_z(\lambda)}}{d\rho(\lambda)},$$

where

$$\rho(\lambda) = \|\mathcal{E}_\lambda x\|^2 = \sum_{k=1}^{\infty} |p_k(\lambda)|^2, \tag{47}$$

so that, given our system of base vectors, the elements of the matrix
defining the transformation $A$ in $H_1$ are

$$a_{ik} = \int_a^b \lambda\, \frac{\overline{dp_k(\lambda)}\; dp_i(\lambda)}{d\rho(\lambda)}. \tag{48}$$

Further, given any element $y$ of $H_1$, there is a corresponding function
$y(\lambda)$ of $L_2$ with respect to $\rho(\lambda)$ in the interval $[a, b]$
such that

$$p_y(\lambda) = \sum_{k=1}^{\infty} \overline{p_k(\lambda)}\,\eta_k = \int_a^{\lambda} y(t)\, d\rho(t), \tag{49}$$

and conversely, given any function $y(\lambda)$ of $L_2$, there is a
corresponding definite element $y$ of $H_1$. Norms and scalar products are
preserved in this correspondence. If we let $\varphi_k(\lambda)$ denote the
function $y(\lambda)$ corresponding to the base vector $\varphi_k$, we obtain

$$\overline{p_k(\lambda)} = \int_a^{\lambda} \varphi_k(t)\, d\rho(t), \tag{50}$$

and the $\varphi_k(\lambda)$ $(k = 1, 2, \dots)$ form a closed orthonormal
system in $L_2$. The operator $A$ in $H_1$ corresponds to multiplication by
$\lambda$ in $L_2$, and instead of (48), we can write for
$a_{ik} = (A\varphi_k, \varphi_i)$:

$$a_{ik} = \int_a^b \lambda\,\varphi_k(\lambda)\,\overline{\varphi_i(\lambda)}\, d\rho(\lambda). \tag{51}$$



Suppose that we took, instead of the $\varphi_k(\lambda)$, a complete
orthonormal system $\psi_k(\lambda)$ in $L_2$, with some complete system of
base vectors $\psi_k$ in $H_1$ corresponding to these $\psi_k(\lambda)$
$(k = 1, 2, \dots)$. Let us introduce a unitary transformation $U$ into $H_1$,
transforming the base vectors $\varphi_k$ into base vectors $\psi_k$, i.e.
$U\varphi_k = \psi_k$. This unitary transformation $U$ in $H_1$ will
correspond to some matrix, which itself depends on the choice of base vectors.
If we choose the $\varphi_k$ or $\psi_k$ as base vectors, we get the same
matrix with elements

$$c_{ik} = (U\varphi_k, \varphi_i) = (\psi_k, \varphi_i),$$

or

$$c_{ik} = (U\psi_k, \psi_i) = (\psi_k, U^{*}\psi_i) = (\psi_k, U^{-1}\psi_i) = (\psi_k, \varphi_i).$$

Since passage from $H_1$ to $L_2$ does not change the scalar product, we
can write

$$c_{ik} = \int_a^b \psi_k(\lambda)\,\overline{\varphi_i(\lambda)}\, d\rho(\lambda). \tag{52}$$

In the new base vectors, the elements of the matrix that corresponds
to the operator $A$ will be defined by formulae analogous to (51):

$$b_{ik} = \int_a^b \lambda\,\psi_k(\lambda)\,\overline{\psi_i(\lambda)}\, d\rho(\lambda). \tag{53}$$

If $\xi_k$ $(k = 1, 2, \dots)$ are the components of an element in the base
vectors $\varphi_k$ and $\xi'_k$ are the components of the same element in
base vectors $\psi_k$, then $\xi_k = (x, \varphi_k)$ and
$\xi'_k = (x, \psi_k) = (x, U\varphi_k) = (U^{-1}x, \varphi_k)$,
whence it is clear that $(\xi'_1, \xi'_2, \dots)$ is expressible in terms of
$(\xi_1, \xi_2, \dots)$ with the aid of the inverse matrix to $c_{ik}$. Thus,
if we write $A$, $B$ and $C$ for the matrices with elements $a_{ik}$,
$b_{ik}$ and $c_{ik}$, we have the matrix equation

$$B = C^{-1} A C. \tag{54}$$

Using formula (263) of [147], together with (49) and (50), we can
write

$$\{\mathcal{E}_\mu\}_{ik} = l_{ik}(\mu) = \int_a^{\mu} \varphi_k(\lambda)\,\overline{\varphi_i(\lambda)}\, d\rho(\lambda). \tag{55}$$

On recalling (39), we can write the expression for the elements of
the matrix $f(A)$ (in base vectors $\varphi_k$), where $f(\lambda)$ is any
bounded $B$ function, defined in the interval $[a, b]$:

$$\{f(A)\}_{ik} = \int_a^b f(\lambda)\,\varphi_k(\lambda)\,\overline{\varphi_i(\lambda)}\, d\rho(\lambda). \tag{56}$$

If $f(\lambda) = 1/(\lambda - \mu)$, we get the resolvent of the operator:

$$\{R(\mu)\}_{ik} = \int_a^b \frac{\varphi_k(\lambda)\,\overline{\varphi_i(\lambda)}}{\lambda - \mu}\, d\rho(\lambda). \tag{57}$$

If $f(\lambda)$ is a real function, $f(A)$ is a self-conjugate operator, and
we can write expressions similar to (55) for the elements of the spectral
function $\mathcal{E}'_\mu$ of the operator $f(A)$:

$$\{\mathcal{E}'_\mu\}_{ik} = \int_{C_\mu} \varphi_k(\lambda)\,\overline{\varphi_i(\lambda)}\, d\rho(\lambda), \tag{58}$$

where $C_\mu$ is the set of values of $\lambda$, defined by
$f(\lambda) \le \mu$. We shall not dwell on the proof of this formula. The
nature of the spectrum of $f(A)$ will vary, depending on the properties of
$f(\lambda)$.

We started above from a given operator $A$, self-conjugate in $H_1$,
and a given element $x$ such that the closed linear envelope of
$\mathcal{E}_\lambda x$ is $H_1$. By introducing the base vectors
$\varphi_k$, we arrived at $l_2$ and infinite matrices for which the above
formulae hold. Conversely, we can choose any continuous function
$\rho(\lambda)$, non-decreasing in the interval $[a, b]$, and vanishing at
$\lambda = a$, and a closed orthonormal system $\varphi_i(\lambda)$.
After this, formulae (51) define elements $a_{ik}$ that obviously satisfy the
condition $a_{ik} = \bar{a}_{ki}$. It may soon be shown that the matrix with
elements $a_{ik}$ corresponds to a bounded operator in $l_2$. For, on writing
$N$ for the greatest value of $|\lambda|$ in the interval $[a, b]$, we have,
by (51),

$$\Bigl|\sum_{i,k=1}^{m} a_{ik}\,\xi_k\,\bar{\xi}_i\Bigr| \le N \int_a^b \Bigl|\sum_{k=1}^{m} \varphi_k(\lambda)\,\xi_k\Bigr|^2 d\rho(\lambda)$$

or, since the $\varphi_k(\lambda)$ are orthogonal and normalized:

$$\Bigl|\sum_{i,k=1}^{m} a_{ik}\,\xi_k\,\bar{\xi}_i\Bigr| \le N \sum_{k=1}^{m} |\xi_k|^2,$$

whence it follows that the corresponding operator is bounded. It is
self-conjugate, since $a_{ik} = \bar{a}_{ki}$. Formulae (55) define the
elements of a projection operator which depends on the parameter $\mu$ and is
a resolution of the identity, where obviously,

$$a_{ik} = \int_a^b \mu\, d\{\mathcal{E}_\mu\}_{ik},$$

i.e. $\mathcal{E}_\mu$ is the spectral function of the operator $A$. If we
pass to the conjugates in (50), we get the components $p_k(\lambda)$ of the
element $\mathcal{E}_\lambda x$, and when $\lambda = b$, the components of the
element $x$ itself. In view of the closure equation, it follows from (50)
that $\rho(\lambda)$ is given by (47). In the general case of a
self-conjugate operator $A$ with continuous spectrum, we form pair-wise
orthogonal invariant subspaces $H_1, H_2, \dots$ in each of which $A$ has a
simple spectrum. If we bring in the base vectors for each $H_k$, we obtain
formulae of the above type for each $H_k$. The final expression, say for the
bilinear form $(Ax, y)$, can then be obtained by addition of the bilinear
forms in each of the $H_k$.

The concept of simple spectrum can readily be generalized, by
renouncing the requirement that the spectral function $\mathcal{E}_\lambda$
be continuous. But there must exist, as before, an element $x$ such that
$\mathcal{E}_\lambda x$ forms the whole of $H$. The non-decreasing function
$\rho(\lambda)$ given by (47) is not necessarily continuous in this case. We
can obviously take $x$ to be a normalized element so that $\rho(b) = 1$ as
well as $\rho(a) = 0$. If $A$ has, say, a purely point spectrum and all the
eigenvalues have rank equal to one, on taking $x$ to be any element such that
all its Fourier coefficients with respect to the closed system of
eigenelements differ from zero, we can assert that $\mathcal{E}_\lambda x$
forms the whole of $H$, and the spectrum in question will be simple. It may
easily be seen that the spectrum cannot be simple when multiple eigenvalues
exist. We get the general case of a simple spectrum if, on splitting the
whole of $H$ into a subspace of eigenelements and a subspace with a
continuous spectrum, simple spectra are obtained in both cases. The spectrum
will be simple in the first subspace when and only when all the eigenvalues
are simple. If a simple spectrum is not continuous, $\mathcal{E}_\lambda$ has
jumps, and the $\rho(\lambda)$ given by (47) must also have discontinuities
at the points where $\mathcal{E}_\lambda$ is discontinuous. For, if
$\rho(\lambda)$ were continuous at a point $\lambda = \lambda'$ where
the spectral function $\mathcal{E}_\lambda$ is discontinuous, we should have
$(\mathcal{E}_{\lambda'} - \mathcal{E}_{\lambda' - 0})\,x = 0$, and all the
elements of the space formed by $\mathcal{E}_\lambda x$ would be
orthogonal to the eigenelements corresponding to the eigenvalue
$\lambda = \lambda'$, whence it would follow that $\mathcal{E}_\lambda x$
cannot form the whole of $H$.




167. Jacobian matrices. Let $A$ be a self-conjugate operator, having
a simple continuous spectrum, in infinite-dimensional space $H$. On
orthogonalizing powers of $\lambda$ with respect to $\rho(\lambda)$ in the
interval $[a, b]$, we obtain a system of real polynomials $P_k(\lambda)$
$(k = 0, 1, 2, \dots)$ of degrees $k$:

$$\int_a^b P_i(\lambda)\,P_k(\lambda)\, d\rho(\lambda) = \begin{cases} 0 & \text{for } i \ne k, \\ 1 & \text{for } i = k, \end{cases} \tag{59}$$

as the closed system $\varphi_i(\lambda)$ of the previous section.

The first coefficient in each $P_k(\lambda)$ can be assumed positive. In the
previous notation we enumerated the $\varphi_k(\lambda)$ starting from
$k = 1$. We now enumerate the $P_k(\lambda)$ starting with $k = 0$, since $k$
denotes the degree of $P_k(\lambda)$. Thus the $P_k(\lambda)$ replace the
$\varphi_{k+1}(\lambda)$. Given our choice of base vectors, the elements of
the matrix corresponding to the operator $A$ are defined, in accordance with
(51), by

$$a_{ik} = \int_a^b \lambda\,P_i(\lambda)\,P_k(\lambda)\, d\rho(\lambda) \quad (i, k = 0, 1, 2, \dots). \tag{60}$$

Let $k - i > 1$. The product $\lambda P_i(\lambda)$ is now a polynomial of
degree lower than $k$; this product is linearly expressible in terms of
$P_s(\lambda)$ for $s < k$, and, by (59), integral (60) is now equal to zero.
Similarly, it is equal to zero for $i - k > 1$, since $a_{ki} = a_{ik}$, i.e.
$a_{ik} = 0$ for $|i - k| > 1$.
Let us introduce the notation:

$$a_k = \int_a^b \lambda\,P_k^2(\lambda)\, d\rho(\lambda); \qquad b_k = \int_a^b \lambda\,P_k(\lambda)\,P_{k+1}(\lambda)\, d\rho(\lambda) \quad (k = 0, 1, 2, \dots). \tag{61}$$

The number $b_k$ appears in the linear form of the product
$\lambda P_k(\lambda)$ in terms of $P_s(\lambda)$
$(s = 0, 1, 2, \dots, k + 1)$:

$$\lambda P_k(\lambda) = b_k\,P_{k+1}(\lambda) + \sum_{s=0}^{k} c_s^{(k)}\,P_s(\lambda), \tag{62}$$

and $b_k > 0$, since the first coefficients of the $P_k(\lambda)$ are
positive. It follows from (60) and what has been said that

$$a_{kk} = a_k; \qquad a_{k,k+1} = a_{k+1,k} = b_k; \qquad a_{ik} = 0 \ \text{ for } |i - k| > 1, \tag{63}$$

and the matrix of the transformation therefore has the form, given
our choice of base vectors:

$$\begin{pmatrix} a_0 & b_0 & 0 & 0 & 0 & \cdots \\ b_0 & a_1 & b_1 & 0 & 0 & \cdots \\ 0 & b_1 & a_2 & b_2 & 0 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix}, \tag{64}$$

where $b_k > 0$. A real self-conjugate matrix that satisfies conditions
(63) is described as Jacobian. Therefore, given a suitable choice of
base vectors, the matrix of a self-conjugate operator with a simple
continuous spectrum is a Jacobian matrix.

The coefficients in expansion (62) are readily calculated by using
(59) and notation (61), by multiplying both sides of (62) by
$P_m(\lambda)\,d\rho(\lambda)$ and integrating with respect to $\lambda$.
When $m < k - 1$, the integral of
$\lambda P_k(\lambda)\,P_m(\lambda)\,d\rho(\lambda)$ vanishes, as we have
seen, and hence it follows that the $c_m^{(k)} = 0$ for $m < k - 1$. We use
notation (61) when calculating the remaining coefficients, and arrive at the
following relationships between the polynomials $P_k(\lambda)$:

$$\lambda P_k(\lambda) = b_k\,P_{k+1}(\lambda) + a_k\,P_k(\lambda) + b_{k-1}\,P_{k-1}(\lambda), \tag{65}$$

where

$$P_{-1}(\lambda) = 0; \qquad P_0(\lambda) = 1. \tag{66}$$

The last equation is due to the fact that we can assume $\rho(b) = 1$, as
mentioned above, whilst the first is regarded as a definition. As in the
previous section, it is also possible to start from a continuous,
non-decreasing function $\rho(\lambda)$, form a system of polynomials,
orthogonal with respect to $\rho(\lambda)$, and elements of a Jacobian matrix
in accordance with (60). By what was said in [163], the elements of matrix
(64) must be bounded in absolute value by the same number. This is also
easily proved from (61). The above arguments lead to the result:

THEOREM 4. Every self-conjugate matrix, corresponding to a bounded
operator with a simple continuous spectrum, is the unitary equivalent
of some Jacobian matrix of type (64) with bounded elements and $b_k > 0$.
We can obtain all such matrices from (60), where $[a, b]$ is any finite
interval, $\rho(\lambda)$ is a non-decreasing function which is continuous in
this interval and is subject to the conditions $\rho(a) = 0$ and
$\rho(b) = 1$ (the latter is not important), and $P_k(\lambda)$ form a system
of polynomials, orthogonal and normalized with respect to $\rho(\lambda)$.
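
The three-term relation (65) with the starting values (66) determines the
polynomials from the entries of the Jacobian matrix, and this can be sketched
numerically. The example below is an illustration under an assumed choice of
weight: $\rho(\lambda) = (\lambda + 1)/2$ on $[-1, 1]$, for which the
orthonormal polynomials are the normalized Legendre polynomials with
$a_k = 0$ and $b_k = (k+1)/\sqrt{(2k+1)(2k+3)}$. The function name
`orthonormal_polys` is hypothetical.

```python
import math

def orthonormal_polys(a, b, lam):
    """Values P_0(lam), ..., P_n(lam) obtained from the recurrence
    lam*P_k = b[k]*P_{k+1} + a[k]*P_k + b[k-1]*P_{k-1},  P_{-1} = 0,  P_0 = 1."""
    prev, cur = 0.0, 1.0
    vals = [cur]
    for k in range(len(b)):
        b_prev = b[k - 1] if k > 0 else 0.0   # P_{-1} = 0, so this term drops out
        nxt = ((lam - a[k]) * cur - b_prev * prev) / b[k]
        vals.append(nxt)
        prev, cur = cur, nxt
    return vals

# Assumed weight: d rho = d lam / 2 on [-1, 1] (normalized Legendre case).
n = 4
a = [0.0] * n
b = [(k + 1) / math.sqrt((2 * k + 1) * (2 * k + 3)) for k in range(n)]
vals = orthonormal_polys(a, b, 0.3)
print(vals)
```

For this weight the values produced by the recurrence agree with the closed
forms $P_1 = \sqrt{3}\,\lambda$ and $P_2 = \sqrt{5}\,(3\lambda^2 - 1)/2$;
other admissible functions $\rho(\lambda)$ would give other sequences $a_k$,
$b_k$.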

We shall start from a given Jacobian matrix, on the assumption 
that the elements ay, = a; for |i — k| = 1 and differ from zero. If 
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we pass from the initial system of base vectors to a new system, by 
multiplying each vector by an expression of the form e'“, it may 
easily be seen that, given a suitable choice of w,, we get a Jacobian 
matrix that is the unitary equivalent of the given matrix, and is such 
that the elements a,, are positive for |¢ — k| = 1. We can therefore 
assume that the given Jacobian matrix has the form (64), where the a, 
are obviously real and 6; > 0. On recalling Theorem 2 of [163], and 
one of the corollaries of Theorem 1, we can say that the necessary 
and sufficient condition for matrix (64) to effect a linear bounded 
transformation is that the numbers a, and 6, be bounded by the same 
number N, independent of k: 


Ja] < M3 [Bel <<, (67) 
We shall assume this in future. Let 
Yo Yar Yor --- (68) 


be the fundamental system of base vectors. Let $A$ be the self-conjugate
operator corresponding to matrix (64); we can write

$$A\psi_k = b_{k-1}\psi_{k-1} + a_k\psi_k + b_k\psi_{k+1} \quad (k = 0, 1, \ldots;\ \psi_{-1} = 0). \tag{69}$$

If we bring in the polynomials $P_k(\lambda)$ defined by (65) and (66), we
can use (66) to express any $\psi_k$ directly in terms of $\psi_0$, by

$$\psi_k = P_k(A)\,\psi_0. \tag{70}$$


Let $\mathcal{E}_\lambda$ be the spectral function of the operator $A$, i.e. of matrix (64).
It follows at once from (70) that the elements $\mathcal{E}_\lambda\psi_0$ form the whole
of $H$. For, by (70),

$$\psi_k = \int_a^b P_k(\lambda)\,d\mathcal{E}_\lambda\psi_0,$$

where $a$ and $b$ are the bounds of operator (64), i.e. $\psi_k$ is a limit of linear
combinations of elements $\mathcal{E}_\lambda\psi_0$, and every element can be expanded
in base vectors $\psi_k$. A simple (not necessarily continuous) spectrum
thus corresponds to a Jacobian matrix, and the role of basic element $x$
can be played by the first base vector $\psi_0$. We can write, on the basis
of (70):

$$(\psi_i, \psi_k) = (P_i(A)\psi_0, P_k(A)\psi_0) = (P_i(A)P_k(A)\psi_0, \psi_0);$$
$$(A\psi_i, \psi_k) = (A\,P_i(A)P_k(A)\psi_0, \psi_0);$$


and, on introducing the function

$$\varrho(\lambda) = (\mathcal{E}_\lambda\psi_0, \psi_0) = \|\mathcal{E}_\lambda\psi_0\|^2, \tag{71}$$

we obtain from these equations:

$$(\psi_i, \psi_k) = \int_a^b P_i(\lambda)P_k(\lambda)\,d\varrho(\lambda) = \begin{cases} 0 & \text{for } i \ne k, \\ 1 & \text{for } i = k, \end{cases}$$
$$(A\psi_i, \psi_k) = \int_a^b \lambda\,P_i(\lambda)P_k(\lambda)\,d\varrho(\lambda),$$

whence it follows immediately that the polynomials $P_k(\lambda)$ form an
orthonormal system with respect to $\varrho(\lambda)$, and that the elements
of matrix (64) are expressible in terms of them in accordance with (60).
If the function $\mathcal{E}_\lambda$ has discontinuities, an eigenvalue with rank unity
corresponds to each such discontinuity, as we saw in the previous
section. It is clear from this that $\mathcal{E}_\lambda$ cannot reduce to a finite number of
jumps, and the same can be said regarding $\varrho(\lambda)$. Conversely, any Jacobian
matrix can be constructed from (60), by choosing as $\varrho(\lambda)$ any non-
decreasing (not necessarily continuous) function, provided only that
it does not reduce to a finite number of jumps.
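
As a numerical illustration (not part of the original text), the following sketch generates polynomial values from the three-term recurrence implied by (69) and (70), namely $\lambda P_k = b_{k-1}P_{k-1} + a_kP_k + b_kP_{k+1}$ with $P_0 = 1$, and checks their orthonormality for the particular choice $a_k = 0$, $b_k = 1/2$, for which the weight is $(2/\pi)\sqrt{1-\lambda^2}$ on $[-1, +1]$ (the case of Example 1 in [169]). The function names, the normalization $P_0 = 1$ and the midpoint quadrature in $\theta = \arccos\lambda$ are assumptions made for the sketch.

```python
import math

def jacobi_polynomials(a, b, lam):
    """Return [P_0(lam), ..., P_n(lam)] with n = len(b), using
    lam*P_k = b[k-1]*P_{k-1} + a[k]*P_k + b[k]*P_{k+1} and P_0 = 1."""
    values = [1.0]
    for k in range(len(b)):
        lower = b[k - 1] * values[k - 1] if k > 0 else 0.0
        values.append(((lam - a[k]) * values[k] - lower) / b[k])
    return values

def gram_matrix(a, b, nodes=200):
    """Approximate the integrals of P_i*P_k against (2/pi)*sqrt(1 - lam^2)
    on [-1, 1] by the midpoint rule in theta, where lam = cos(theta)."""
    size = len(b) + 1
    gram = [[0.0] * size for _ in range(size)]
    step = math.pi / nodes
    for j in range(nodes):
        theta = (j + 0.5) * step
        p = jacobi_polynomials(a, b, math.cos(theta))
        # sqrt(1 - lam^2) d(lam) becomes sin(theta)^2 d(theta)
        weight = (2.0 / math.pi) * math.sin(theta) ** 2 * step
        for i in range(size):
            for k in range(size):
                gram[i][k] += weight * p[i] * p[k]
    return gram

if __name__ == "__main__":
    n = 5
    for row in gram_matrix([0.0] * n, [0.5] * n):
        print(" ".join(f"{entry:6.3f}" for entry in row))
```

Under these assumptions the printed matrix should be close to the unit matrix, which is the orthonormality stated above.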


168. Differential solutions. Let us take a self-conjugate operator
(matrix) with purely continuous spectrum. As we have seen, a sequence
of mutually orthogonal normalized elements $y^{(s)}$ ($s = 1, 2, \ldots$) can be
formed, such that the $\mathcal{E}_\lambda y^{(s)}$ form mutually orthogonal subspaces
$H_s$, the orthogonal sum of which is the whole of $H$. The number of
elements $y^{(s)}$ may be either finite or infinite. Let $p_k^{(s)}(\lambda)$ ($k = 1, 2, \ldots$)
be the components of the element $\mathcal{E}_\lambda y^{(s)}$. The functions $p_k^{(s)}(\lambda)$ are of
bounded variation, and, for any $s$ and any interval $\Delta$ contained in
$[a, b]$, they satisfy the equations

$$\sum_{k=1}^{\infty} a_{ik}\,\Delta p_k^{(s)}(\lambda) = \int_\Delta \lambda\,dp_i^{(s)}(\lambda) \quad (i = 1, 2, \ldots). \tag{72}$$

The following orthogonal properties of the solutions $p_k^{(s)}(\lambda)$ can be
proved [151]:

$$\sum_{k=1}^{\infty} \Delta_1 p_k^{(s)}(\lambda)\cdot\overline{\Delta_2 p_k^{(t)}(\lambda)} = 0 \tag{73}$$

($s \ne t$; intervals $\Delta_1$ and $\Delta_2$ arbitrary),

$$\sum_{k=1}^{\infty} \Delta_1 p_k^{(s)}(\lambda)\cdot\overline{\Delta_2 p_k^{(s)}(\lambda)} = 0 \tag{73$_1$}$$

($\Delta_1$ and $\Delta_2$ have no common interior points).
The fundamental formulae of [149] yield the following equations,
containing the differential solutions:

$$\sum_s \int_a^b \frac{dp_i^{(s)}(\lambda)\,\overline{dp_k^{(s)}(\lambda)}}{d\varrho_s(\lambda)} = \begin{cases} 0 & (k \ne i), \\ 1 & (k = i), \end{cases} \tag{74}$$

$$a_{ik} = \sum_s \int_a^b \lambda\,\frac{dp_i^{(s)}(\lambda)\,\overline{dp_k^{(s)}(\lambda)}}{d\varrho_s(\lambda)}, \tag{75}$$

$$\{\mathcal{E}_\lambda\}_{ik} = \sum_s \int_a^\lambda \frac{dp_i^{(s)}(\mu)\,\overline{dp_k^{(s)}(\mu)}}{d\varrho_s(\mu)}, \tag{76}$$

where $\{\mathcal{E}_\lambda\}_{ik}$ are the elements of the spectral matrix. If solutions of
system (72), satisfying the orthogonality conditions (73) and (73$_1$),
have been obtained in some manner, and (76) has been proved for
any $\lambda$, we can be sure that the system of solutions obtained is complete,
and that the remaining formulae hold. Let $y$ and $z$ be elements of $l_2$, and

$$y^{(s)}(\lambda) = \sum_{k=1}^{\infty} \eta_k\,\overline{p_k^{(s)}(\lambda)}; \quad z^{(s)}(\lambda) = \sum_{k=1}^{\infty} \zeta_k\,\overline{p_k^{(s)}(\lambda)}.$$

Formula (272) of [149] can be written as

$$(Ay, z) = \sum_s \int_a^b \lambda\,\frac{dy^{(s)}(\lambda)\,\overline{dz^{(s)}(\lambda)}}{d\varrho_s(\lambda)}. \tag{76$_1$}$$

It is an immediate consequence of (75). The remaining formulae of
[149] may be written similarly. It is easily shown that, if a differential
solution $v_k(\lambda)$ ($k = 1, 2, \ldots$) is orthogonal to all the above-mentioned
differential solutions that form a complete system, all the $v_k(\lambda)$ are
constant. The case of constant $v_k(\lambda)$ yields a trivial solution of system
(72), since now $\Delta v_k(\lambda) = 0$ and $dv_k(\lambda) = 0$. The differential solutions
$p_k^{(s)}(\lambda)$ obtained after separating out the point spectrum, are obviously
orthogonal to all the eigenelements of the operator $A$. Let us return to
system (72), and let $p_k(\lambda)$ ($k = 1, 2, 3, \ldots$) be some differential
solution of this system:

$$\sum_{k=1}^{\infty} a_{ik}\,\Delta p_k(\lambda) = \int_\Delta \lambda\,dp_i(\lambda).$$
Suppose that all the functions $p_k(\lambda)$ have continuous derivatives
$p_k'(\lambda)$. The Stieltjes integrals written now reduce to ordinary integrals
of continuous functions, and, if we apply the mean value theorem to
them, and write $\lambda'$ and $\lambda''$ for the ends of the interval $\Delta$, we obtain

$$\sum_{k=1}^{\infty} a_{ik}\,[p_k(\lambda'') - p_k(\lambda')] = \lambda_i\,p_i'(\lambda_i)\,(\lambda'' - \lambda'), \tag{77}$$

where $\lambda' < \lambda_i < \lambda''$. On applying Lagrange's formula to the left-hand
side, dividing both sides by $(\lambda'' - \lambda')$, and letting $\lambda'$ and $\lambda''$ tend to
the common limit $\lambda$, we obtain

$$\sum_{k=1}^{\infty} a_{ik}\,p_k'(\lambda) = \lambda\,p_i'(\lambda). \tag{78}$$

It is assumed here that we can pass to the limit term by term in the
infinite sum on the left-hand side of (77). This will certainly be the
case if the sum is finite, i.e. if the matrix $a_{ik}$ has only a finite number
of non-zero elements in each row, and hence in each column. It will be
seen from (78) that, given these conditions, in the case of a continuous
spectrum, the $p_k'(\lambda)$ ($k = 1, 2, \ldots$) satisfy for any $\lambda$ the same equations
as we had for the eigenelements; but the $p_k'(\lambda)$ do not belong to $l_2$, i.e.
the sum, formed from $|p_k'(\lambda)|^2$, is equal to $+\infty$, since no eigenelements
exist in the case of a purely continuous spectrum. In the case of a
mixed spectrum, the differential solutions can be added to the ordinary
solutions of $l_2$, that yield the eigenelements.


169. Examples. 1. Suppose that, in the interval $[-1, +1]$:

$$\varrho(\lambda) = \frac{2}{\pi}\int_{-1}^{\lambda}\sqrt{1 - \lambda^2}\,d\lambda.$$

Condition (59) for the polynomials $P_k(\lambda)$ becomes

$$\frac{2}{\pi}\int_{-1}^{+1}\sqrt{1 - \lambda^2}\,P_i(\lambda)P_k(\lambda)\,d\lambda = \begin{cases} 0 & \text{for } i \ne k, \\ 1 & \text{for } i = k. \end{cases} \tag{79}$$

It may easily be shown that these conditions are satisfied by the poly-
nomials

$$P_n(\lambda) = \frac{\sin(n+1)\theta}{\sin\theta}, \quad \text{where } \cos\theta = \lambda.$$

We can easily show, by using de Moivre's formula, that the fraction here is
actually a polynomial of degree $n$ in $\cos\theta$. Condition (79) may be verified by
direct substitution, if we change the variable by putting $\lambda = \cos\theta$. The num-
bers $a_k$ and $b_k$ appearing in matrix (64) are defined by

$$a_k = \frac{2}{\pi}\int_{-1}^{+1}\lambda\sqrt{1 - \lambda^2}\,P_k^2(\lambda)\,d\lambda; \quad b_k = \frac{2}{\pi}\int_{-1}^{+1}\lambda\sqrt{1 - \lambda^2}\,P_k(\lambda)P_{k+1}(\lambda)\,d\lambda.$$

On bringing in the variable $\theta$ and evaluating the resulting integrals, we find
that $a_k = 0$ and $b_k = 1/2$ for any $k$, i.e. the elements of the corresponding
matrix are given by $a_{k,k+1} = a_{k+1,k} = 1/2$, whilst the remaining $a_{ik} = 0$. This
matrix has a simple purely continuous spectrum. The unique differential
solution of the system is given, in accordance with (50), by

$$p_n(\lambda) = \frac{2}{\pi}\int_{-1}^{\lambda}\sqrt{1 - \lambda^2}\,P_{n-1}(\lambda)\,d\lambda = -\frac{2}{\pi}\int_{\pi}^{\theta}\sin\theta\,\sin n\theta\,d\theta,$$

whence $p_n'(\lambda) = (2/\pi)\sqrt{1 - \lambda^2}\,P_{n-1}(\lambda) = (2/\pi)\sin n\theta$. On throwing away the
factor $2/\pi$, system (34):

$$\tfrac{1}{2}x_2 = \lambda x_1; \quad \tfrac{1}{2}x_1 + \tfrac{1}{2}x_3 = \lambda x_2; \quad \ldots; \quad \tfrac{1}{2}x_{n-1} + \tfrac{1}{2}x_{n+1} = \lambda x_n; \quad \ldots$$

is seen to have the solution $x_n = \sin(n \arccos\lambda)$, where $-1 < \lambda < +1$.
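
The last claim can be checked numerically. The sketch below is an illustration added here, with hypothetical helper names; it assumes the system has the rows $\tfrac12 x_{n-1} + \tfrac12 x_{n+1} = \lambda x_n$ with the convention $x_0 = 0$. It evaluates the residuals of the first rows for $x_n = \sin(n\arccos\lambda)$ and also shows that the partial sums of $x_n^2$ grow roughly like $N/2$, in agreement with the remark in [168] that such solutions do not belong to $l_2$.

```python
import math

def differential_solution(lam, count):
    """x_n = sin(n * arccos(lam)) for n = 1, ..., count."""
    theta = math.acos(lam)
    return [math.sin(n * theta) for n in range(1, count + 1)]

def row_residuals(lam, rows):
    """Residuals of (1/2)x_{n-1} + (1/2)x_{n+1} - lam*x_n for n = 1..rows,
    with x_0 = 0, so that the first row reads (1/2)x_2 = lam*x_1."""
    x = [0.0] + differential_solution(lam, rows + 1)
    return [0.5 * x[n - 1] + 0.5 * x[n + 1] - lam * x[n]
            for n in range(1, rows + 1)]

def partial_square_sum(lam, count):
    """Sum of x_n^2 over n = 1..count; it grows without bound as count grows."""
    return sum(value * value for value in differential_solution(lam, count))

if __name__ == "__main__":
    lam = 0.3
    print("largest residual:", max(abs(r) for r in row_residuals(lam, 50)))
    for count in (1000, 2000, 4000):
        print(count, partial_square_sum(lam, count))
```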

2. Let us consider the closed, orthonormal system $(1/\sqrt{2\pi})\,e^{ik\lambda}$ ($k = 0, \pm 1,
\pm 2, \ldots$) in the interval $[-\pi, +\pi]$. Taking $\varrho(\lambda) = \lambda$, (65$_1$) gives us the follow-
ing elements for the corresponding matrix:


+n 
a8 1-Maga— <= 1P™ = 
pg = 5 | Me npg eee App = 0,7 
—2 


where the subscripts $p$ and $q$ run from $-\infty$ to $+\infty$. In accordance with
(50):


A 
1 1 
Pit —i gj. 9 (A) = ea tka 
—% 


and (78) lead to the equations 


oo —k 
Py (ar. : eee ae go gist 
(Hin 10 —*) * Yin in?’ 
where the prime on the summation sign indicates that we have to exclude
$k = s$.

The last equation can be rewritten as 


oo —K 
> » (=P ows-a_g 
jas a (8 —k) ; 


or as 
> (=) ia A 
y , 


— 
the value $j = 0$ being excluded. This equation is an ordinary Fourier expansion
of $\lambda$, the series being divergent at the ends of the interval, i.e. for $\lambda = \pm\pi$.
This last is due to the fact that the expansion is written in the complex form.
Let us now apply (56), putting $f(\lambda) = -\pi - \lambda$ for $\lambda < 0$ and $f(\lambda) = \pi - \lambda$
for $\lambda > 0$. We obtain the matrix

$$b_{pq} = \frac{1}{2\pi}\int_{-\pi}^{0}(-\pi - \lambda)\,e^{i(p-q)\lambda}\,d\lambda + \frac{1}{2\pi}\int_{0}^{\pi}(\pi - \lambda)\,e^{i(p-q)\lambda}\,d\lambda,$$

or

$$b_{pq} = \frac{i}{p - q} \ \text{ for } p \ne q \ \text{ and } \ b_{pp} = 0. \tag{80}$$

Since $|f(\lambda)| \le \pi$ in the interval $[-\pi, +\pi]$, we have the following bound for
the quadratic form [155]:

$$\left|\,{\sum_{p,q=-s}^{+t}}' \frac{i\,\xi_p\bar{\xi}_q}{p - q}\right| \le \pi \sum_{k=-s}^{+t} |\xi_k|^2. \tag{81}$$


The factor $i$ on the left of this inequality can obviously be thrown away.
If we put $\xi_p = 0$ for $p < 0$ and the remaining $\xi_p$ are real, we get the following
inequality, given by Hilbert:


7 Entg ee 9 
P39 <a 2 f. (82) 


It may easily be shown that the matrix with elements $a_{pq} = 1/|p - q|$ when
$p \ne q$, and $a_{pp} = 0$, no longer corresponds to a bounded operator. For, if we
put $\xi_k = 1/\sqrt{n}$ for $1 \le k \le n$ and $\xi_k = 0$ for $k > n$, the norm of the element
$(\xi_1, \xi_2, \xi_3, \ldots)$ will be equal to unity, whilst the corresponding quadratic form
becomes

$${\sum_{p,q}}' \frac{\xi_p\xi_q}{|p - q|} = \frac{1}{n}{\sum_{p,q=1}^{n}}' \frac{1}{|p - q|} = \frac{2}{n}\left(\frac{n-1}{1} + \frac{n-2}{2} + \ldots + \frac{n-(n-1)}{n-1}\right) = 2\left(1 + \frac{1}{2} + \ldots + \frac{1}{n-1} - \frac{n-1}{n}\right),$$

and this last expression increases indefinitely as $n$ increases, since the sum
$1 + 1/2 + \ldots + 1/(n - 1)$ increases indefinitely, whilst the fraction $(n - 1)/n$
tends to unity. In case (80) we do not have absolute convergence of the infinite
double series, but we can say that, for any two elements of l,, the limit exists: 


lim WS, <, fom. Sl S_&_.]— 
Pee: Pe jut - S| 3 stilt 


q=1 p=l 


the terms on the left for which p = g, and the value p = g in the inner summa- 
tion over p, being excluded. 
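
The unboundedness of the matrix with elements $1/|p - q|$ can be observed directly. The following sketch (an added illustration; the function names are hypothetical) evaluates the quadratic form for $\xi_k = 1/\sqrt{n}$, $1 \le k \le n$, by brute force and compares it with the expression $2\,(1 + 1/2 + \ldots + 1/(n-1) - (n-1)/n)$, which grows like $2\log n$.

```python
import math

def quadratic_form(n):
    """Sum of xi_p*xi_q/|p - q| over p != q, with xi_k = 1/sqrt(n), k = 1..n."""
    xi = 1.0 / math.sqrt(n)
    total = 0.0
    for p in range(1, n + 1):
        for q in range(1, n + 1):
            if p != q:
                total += xi * xi / abs(p - q)
    return total

def closed_form(n):
    """2*(1 + 1/2 + ... + 1/(n-1) - (n-1)/n)."""
    harmonic = sum(1.0 / d for d in range(1, n))
    return 2.0 * (harmonic - (n - 1) / n)

if __name__ == "__main__":
    for n in (10, 100, 400):
        print(n, quadratic_form(n), closed_form(n))
```

The norm of the element stays equal to unity for every $n$, so the growth of the printed values shows that no bound of the type (81) can hold for this matrix.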
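3. Let us now take, instead of $e^{ik\lambda}$, the real closed orthonormal system

$$\varphi_k(\lambda) = \frac{1}{\sqrt{2\pi}}(\sin k\lambda + \cos k\lambda) \quad (k = 0, \pm 1, \pm 2, \ldots).$$

On applying (56), we arrive at the matrix

$$b_{pq} = \frac{1}{2\pi}\int_{-\pi}^{+\pi} f(\lambda)\,(\sin p\lambda + \cos p\lambda)(\sin q\lambda + \cos q\lambda)\,d\lambda,$$

or

$$b_{pq} = \frac{1}{2\pi}\int_{-\pi}^{+\pi} f(\lambda)\cos(p - q)\lambda\,d\lambda + \frac{1}{2\pi}\int_{-\pi}^{+\pi} f(\lambda)\sin(p + q)\lambda\,d\lambda.$$

On putting, as before, $f(\lambda) = -\pi - \lambda$ for $\lambda < 0$ and $f(\lambda) = \pi - \lambda$ for
$\lambda > 0$, we get the following matrix:

$$b_{pq} = \frac{1}{p + q} \ \text{ for } p + q \ne 0 \ \text{ and } \ b_{pq} = 0 \ \text{ for } p + q = 0.$$

We arrive at the inequality, analogous to (81):

$$\left|\,{\sum_{p,q=-s}^{+t}}' \frac{\xi_p\bar{\xi}_q}{p + q}\right| \le \pi \sum_{k=-s}^{+t} |\xi_k|^2, \tag{83}$$

where the prime indicates that we have to exclude the terms for which $p + q = 0$.
The inequality corresponding to (82) is

$$\sum_{p,q=1}^{t} \frac{\xi_p\xi_q}{p + q} \le \pi \sum_{p=1}^{t} \xi_p^2. \tag{84}$$

All the numbers can be assumed positive, and this double series is absolutely
convergent for any element of $l_2$, so that we can write:

$$\sum_{p,q=1}^{\infty} \frac{\xi_p\xi_q}{p + q} \le \pi \sum_{p=1}^{\infty} \xi_p^2. \tag{85}$$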


170. Weak convergence in $l_2$. Let $x^{(n)}(\xi_1^{(n)}, \xi_2^{(n)}, \ldots)$ ($n = 1, 2, \ldots$)
be a sequence of elements strongly convergent to the element $x(\xi_1, \xi_2,
\ldots)$, i.e. $x^{(n)} \Rightarrow x$. This can be written as

$$\sum_{k=1}^{\infty} |\xi_k - \xi_k^{(n)}|^2 \to 0 \ \text{ as } n \to \infty. \tag{86}$$

It follows from this that

$$\sum_{k=1}^{\infty} |\xi_k^{(n)}|^2 \to \sum_{k=1}^{\infty} |\xi_k|^2, \tag{87}$$

and the $\|x^{(n)}\|$ are bounded by the number $l$ (independent of $n$).
It follows at once from (86) that

$$\xi_k^{(n)} \to \xi_k \quad (k = 1, 2, \ldots), \tag{88}$$

but (86) does not follow from (88).
Let us show that condition (88), together with the condition that
the norms of elements $x^{(n)}$ be bounded:

$$\sum_{k=1}^{\infty} |\xi_k^{(n)}|^2 \le l^2, \tag{89}$$

is equivalent to the weak convergence $x^{(n)} \rightharpoonup x$.

If we have weak convergence, the $\|x^{(n)}\|$ must be bounded, i.e.
(89) must hold for some choice of $l$ and, in addition, we must have
$(x^{(n)}, \varphi_k) \to (x, \varphi_k)$, where $\varphi_k$ are the base vectors mentioned pre-
viously, whilst this leads to (88).

Conversely, if conditions (88) and (89) are fulfilled, the weak con-
vergence of $x^{(n)}$ to $x$ follows at once from what was said in [132].

We can therefore state the theorem:

THEOREM. Conditions (88) and (89) are necessary and sufficient for
the existence of a weak limit of the sequence of elements $x^{(n)}(\xi_1^{(n)}, \xi_2^{(n)}, \ldots)$,
and if they are fulfilled, the limiting element has the components $(\xi_1, \xi_2,
\ldots)$.
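
A standard example, added here for illustration, is the sequence of base vectors themselves: every component tends to zero and the norms are all equal to unity, so the sequence converges weakly to zero, yet its distance from zero stays equal to unity, so there is no strong convergence. The sketch below works with vectors truncated to finitely many components, which is an assumption of the illustration, and uses hypothetical helper names.

```python
import math

def unit_vector(n, length):
    """First `length` components of the n-th base vector (1 in position n)."""
    return [1.0 if k == n else 0.0 for k in range(1, length + 1)]

def inner_product(x, y):
    """Scalar product of two real vectors of equal length."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner_product(x, x))

if __name__ == "__main__":
    length = 1000
    # A fixed element of l_2 (truncated): y = (1, 1/2, 1/3, ...)
    y = [1.0 / k for k in range(1, length + 1)]
    for n in (1, 10, 100, 1000):
        x_n = unit_vector(n, length)
        # (x_n, y) = 1/n tends to 0, while the norm of x_n stays equal to 1
        print(n, inner_product(x_n, y), norm(x_n))
```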


171. Completely continuous operators in $l_2$. We have already
obtained [108] a sufficient condition for an infinite matrix to define
a completely continuous operator in $l_2$, viz. if the double series

$$\sum_{n,m=1}^{\infty} |a_{nm}|^2 \tag{90}$$

is convergent, the matrix $a_{nm}$ defines a completely continuous operator
in $l_2$.

The convergence of series (90) is merely sufficient for the operator
defined by the matrix $a_{nm}$ to be completely continuous. It can be
shown that the necessary and sufficient condition is that the passage
to the limit indicated in (8) should hold uniformly for $x$ and $y$, the
norms of which do not exceed unity. The equation $(E - \mu A)x = y$
has the form in $l_2$,

$$\xi_n - \mu \sum_{m=1}^{\infty} a_{nm}\xi_m = \eta_n, \tag{91}$$

where $(\eta_1, \eta_2, \ldots)$ is the given and $(\xi_1, \xi_2, \ldots)$ is the required element
of $l_2$. If $A$ is a completely continuous operator, everything said in [135]
holds for system (91). Let $A$ be a self-conjugate completely continuous
operator and $\psi_k$ ($k = 1, 2, \ldots$) be a complete orthonormal system of
its eigenelements. Let $U$ be the unitary operator in $l_2$ defined by the con-
ditions $\psi_k = U\varphi_k$, where $\varphi_k$ are the previous base vectors in $l_2$. If we
take the $\psi_k$ as the new base vectors in $l_2$, the operator $A$ now becomes
$B = UAU^{-1}$. Its components are given by $\{B\}_{nm} = (B\psi_m, \psi_n) =
\lambda_m(\psi_m, \psi_n)$, since $B\psi_m = \lambda_m\psi_m$.
Consequently,

$$\{B\}_{nm} = \begin{cases} \lambda_n & \text{for } m = n, \\ 0 & \text{for } m \ne n, \end{cases} \tag{92}$$

i.e. a diagonal matrix corresponds to the operator $B$ in the base vectors
$\psi_k$, the diagonal being composed of the eigenvalues of the operator.
This remains true for any linear self-conjugate or unitary operator 
with purely point spectrum [146]. 

The operator $A$ is the unitary equivalent of $B$, in fact $A = U^{-1}BU$.
In view of what has been said, we can assert that the matrix correspond-
ing to a completely continuous self-conjugate operator in $l_2$ is the
unitary equivalent of a diagonal matrix in which the diagonal elements
$\lambda_m$ satisfy the conditions given in [136].


172. Integral operators in $L_2$. We have already considered integral
operators in $L_2$. Let us now consider them in more detail in $L_2$:

$$\varphi(x) = \int_a^b K(x, y)\,f(y)\,dy, \tag{93}$$

where $K(x, y)$ is a measurable function in the square $\Delta_0$ ($a \le x \le b$;
$a \le y \le b$), and hence is measurable with respect to $y$ for almost all $x$
of $[a, b]$, and vice versa. Suppose further that it belongs to $L_2$ as a
function of $y$ for almost all $x$, and vice versa, i.e.

$$K^2(x) = \int_a^b |K(x, y)|^2\,dy < +\infty, \tag{94}$$

$$K_1^2(y) = \int_a^b |K(x, y)|^2\,dx < +\infty, \tag{95}$$

where $K(x)$ and $K_1(y)$ are measurable non-negative functions [67, 68].
It follows from (94) that, given any $f(y) \in L_2$, integral (93) exists for
almost all $x$, and $\varphi(x)$ is a measurable function [67, 68]. The necessary
and sufficient condition for (93) to be a linear bounded transformation
when (94) holds is that there exists a positive number $N$ such that,
for any $f(x)$ of $L_2$,

$$\int_a^b |\varphi(x)|^2\,dx = \int_a^b \left|\int_a^b K(x, y)\,f(y)\,dy\right|^2 dx \le N^2 \int_a^b |f(y)|^2\,dy. \tag{96}$$

There is a simple sufficient condition for the operator corresponding
to the kernel $K(x, y)$ to be bounded, precisely analogous to the con-
dition for the matrix to be bounded: a positive number $l$ must exist,
such that

$$\int_a^b |K(x, y)|\,dy \le l \ \text{ and } \ \int_a^b |K(x, y)|\,dx \le l. \tag{97}$$

It is sufficient to show that the corresponding bilinear functional
must be bounded. On replacing all the functions by their moduli in
the iterated integral expressing the functional, the iterated integral can be
replaced by a double integral:

$$\int_a^b\!\!\int_a^b |K(x, y)|\,|f_1(y)|\,|f_2(x)|\,dx\,dy \le \frac{1}{2}\int_a^b\!\!\int_a^b |K(x, y)|\,\big[\,|f_1(y)|^2 + |f_2(x)|^2\,\big]\,dx\,dy =$$

$$= \frac{1}{2}\int_a^b \Big[\int_a^b |K(x, y)|\,dx\Big]\,|f_1(y)|^2\,dy + \frac{1}{2}\int_a^b \Big[\int_a^b |K(x, y)|\,dy\Big]\,|f_2(x)|^2\,dx \le$$

$$\le \frac{l}{2}\Big[\int_a^b |f_1(y)|^2\,dy + \int_a^b |f_2(x)|^2\,dx\Big].$$

But this last expression is equal to $l$, if $\|f_1\| = \|f_2\| = 1$. By using an
exactly similar method of proof, a more general sufficient condition
can be given for operator (93) to be bounded, viz. there exist a
positive number $l$ and a positive function $\omega(x)$, continuous in $[a, b]$,
such that

$$\int_a^b |K(x, y)|\,\omega(y)\,dy \le l\,\omega(x); \quad \int_a^b |K(x, y)|\,\omega(x)\,dx \le l\,\omega(y). \tag{98}$$
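
Condition (97) can be tried out on a discretized kernel. The sketch below is an added illustration under stated assumptions: the kernel is real, it is sampled at the midpoints of $n$ equal subintervals, and the operator norm is estimated from below by power iteration. All function names are hypothetical. For the resulting matrix the estimate should not exceed the larger of the two sums in (97).

```python
import math

def discretized_kernel(kernel, a, b, n):
    """Matrix h*K(x_i, y_j) on the midpoints of n equal subintervals of [a, b]."""
    h = (b - a) / n
    pts = [a + (i + 0.5) * h for i in range(n)]
    return [[h * kernel(x, y) for y in pts] for x in pts]

def schur_bound(matrix):
    """The number l of (97): the larger of the maximal absolute row sum
    and the maximal absolute column sum."""
    rows = max(sum(abs(v) for v in row) for row in matrix)
    cols = max(sum(abs(row[j]) for row in matrix) for j in range(len(matrix[0])))
    return max(rows, cols)

def norm_estimate(matrix, iterations=100):
    """Power iteration on M^T M; gives a lower estimate of the norm of M."""
    n = len(matrix[0])
    v = [1.0 / math.sqrt(n)] * n  # unit starting vector
    estimate = 0.0
    for _ in range(iterations):
        w = [sum(row[j] * v[j] for j in range(n)) for row in matrix]   # M v
        u = [sum(matrix[i][j] * w[i] for i in range(len(matrix)))      # M^T (M v)
             for j in range(n)]
        length = math.sqrt(sum(c * c for c in u))
        if length == 0.0:
            return 0.0
        estimate = math.sqrt(length)  # |M^T M v| <= |M|^2 since |v| = 1
        v = [c / length for c in u]
    return estimate

if __name__ == "__main__":
    m = discretized_kernel(lambda x, y: math.exp(-abs(x - y)), 0.0, 1.0, 40)
    print("norm estimate:", norm_estimate(m), " bound l:", schur_bound(m))
```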


173. The conjugate operator. In the case of a bounded operator,
no meaning may attach to the integral

$$\int_a^b K(x)\,\tau(x)\,dx, \tag{99}$$

where the non-negative function $K(x)$ is given by (94), for certain $\tau(x)$
of $L_2$. The set of the $\tau(x)$ of $L_2$ for which it has a meaning is obviously
a lineal $l$ in $L_2$.

THEOREM 1. The lineal $l$ is everywhere dense in $L_2$.

We have to show that the closure of $l$ gives the whole of $L_2$. If this
were not the case, there would exist a non-zero element $\chi(x)$ of $L_2$,
orthogonal to the subspace formed by the closure of $l$, and hence to all
the $\tau(x)$ of $l$. It is therefore sufficient for us to show that, if $\chi(x) =
\chi_1(x) + i\chi_2(x)$ is orthogonal to all the $\tau(x)$ of $l$, i.e.

$$\int_a^b \tau(x)\,\overline{\chi(x)}\,dx = 0, \tag{100}$$

$\chi(x)$ must be equivalent to zero. We choose $\tau(x)$ of $l$ in a special way.
Let $m$ be any given finite positive number, $e_m$ the set of the $x$ for which
$K(x) \le m$ and $e_m'$ any part of $e_m$, of measure $\le m$. We define $\tau(x)$ so
that $\tau(x) = 1$ if $x \in e_m'$, and $\tau(x) = 0$ for other $x$. This $\tau(x)$ belongs
to $l$, and we obtain on applying (100):

$$\int_{e_m'} \overline{\chi(x)}\,dx = \int_{e_m'} [\chi_1(x) - i\chi_2(x)]\,dx = 0. \tag{101}$$

This equation holds for any part $e_m'$, so that, e.g.,

$$\int_{e_m'} \chi_1^+(x)\,dx = 0, \tag{102}$$

where $\chi_1^+(x)$ is the positive part of $\chi_1(x)$, i.e. $\chi_1^+(x)$ is equivalent to zero
on $e_m$. If $m$ increases indefinitely and use is made of (94), $\chi_1^+(x)$ is
seen to be equivalent to zero in $[a, b]$. We can similarly assert the same
thing for $\chi_1^-(x)$, $\chi_2^+(x)$ and $\chi_2^-(x)$, and the theorem is proved. Notice
that we have only made use in the proof of the fact that $K(x)$ is any
given non-negative function, finite and measurable almost everywhere
in $[a, b]$. We shall in future write $l_1$ for the analogous lineal for the
product $K_1(x)\tau(x)$. It is also everywhere dense in $L_2$.

THEOREM 2. If (93) defines a bounded operator, the conjugate operator
is the integral operator with the kernel

$$K^*(x, y) = \overline{K(y, x)}. \tag{103}$$

On writing $A$ for operator (93) and using the definition of the con-
jugate operator $(Ax, y) = (x, A^*y)$, we can write

$$\int_a^b \Big[\int_a^b K(x, y)\,\tau(y)\,dy\Big]\,\overline{g(x)}\,dx = \int_a^b \tau(y)\,\overline{g^*(y)}\,dy, \tag{104}$$

where $g^*(x) = A^*g(x)$, and it is assumed that $\tau(y) \in l_1$.
On using the inequality

$$\int_a^b |K(x, y)\,g(x)|\,dx \le \Big(\int_a^b |K(x, y)|^2\,dx\Big)^{1/2} \cdot \Big(\int_a^b |g(x)|^2\,dx\Big)^{1/2} = K_1(y) \cdot \|g\|$$

and the fact that $\tau(x) \in l_1$, we can say that one of the iterated integrals
exists for $|K(x, y)\,g(x)\,\tau(y)|$, so that we can change the order of in-
tegration in the integral on the left-hand side of (104), i.e. we can
rewrite (104) as

$$\int_a^b \tau(y)\Big[\int_a^b K(x, y)\,\overline{g(x)}\,dx - \overline{g^*(y)}\Big]\,dy = 0.$$

A repetition of the proof of Theorem 1 shows us that the difference
in square brackets is equivalent to zero, and we can write, on passing
to the conjugates:

$$g^*(y) = \int_a^b \overline{K(x, y)}\,g(x)\,dx, \tag{105}$$

whence the theorem follows. In view of this theorem, the equation
$(Ax, y) = (x, A^*y)$ can be written for integral operators as

$$\int_a^b \Big[\int_a^b K(x, y)\,f(y)\,dy\Big]\,\overline{g(x)}\,dx = \int_a^b \Big[\int_a^b K(x, y)\,\overline{g(x)}\,dx\Big]\,f(y)\,dy, \tag{106}$$

which amounts to saying that the order of integration can be changed.
The corresponding double integral may not exist. If, in addition to the
conditions indicated, the kernel satisfies

$$K(x, y) = \overline{K(y, x)}, \tag{107}$$

operator (105) is the same as (93), i.e. (93) is a self-conjugate operator.


174. Completely continuous operators. We saw above that, if a
function $K(x, y)$ measurable in the square $\Delta_0$, satisfies

$$\iint_{\Delta_0} |K(x, y)|^2\,dx\,dy < +\infty, \tag{108}$$

operator (93) is completely continuous in $L_2$. What was said in [135]
holds for the integral equation

$$\int_a^b K(x, y)\,f(y)\,dy = \lambda f(x) + \varphi(x), \tag{109}$$

where $\varphi(x)$ is the given and $f(x)$ the required function of $L_2$ in $[a, b]$.
If operator (93) is self-conjugate, i.e. $K(x, y)$ is equivalent to $\overline{K(y, x)}$,
what was said in [136] is applicable to equation (109). Let us show
that integral (108) is equal to the square of the absolute norm of operator (93).
A preliminary lemma is needed.

LEMMA. If $\varphi_n(x)$ ($n = 1, 2, \ldots$) is a closed orthonormal system in the
interval $[a, b]$, $\varphi_{mn}(x, y) = \varphi_m(x)\,\overline{\varphi_n(y)}$ is a closed orthonormal system
in the square $\Delta_0$.

By hypothesis,

$$\int_a^b \varphi_i(x)\,\overline{\varphi_k(x)}\,dx = \begin{cases} 0 & \text{for } i \ne k, \\ 1 & \text{for } i = k. \end{cases}$$

The functions $\varphi_{mn}(x, y)$ obviously belong to $L_2(\Delta_0)$, and by Fubini's
theorem,

$$\iint_{\Delta_0} \varphi_{mn}(x, y)\,\overline{\varphi_{pq}(x, y)}\,dx\,dy = \int_a^b \varphi_m(x)\,\overline{\varphi_p(x)}\,dx \cdot \int_a^b \overline{\varphi_n(y)}\,\varphi_q(y)\,dy,$$

whence it follows that system $\varphi_{mn}(x, y)$ is orthonormal in $\Delta_0$. To prove
that this system is closed, we only need to show that, if $f$ is orthogonal
to all the $\varphi_{mn}$, it is equivalent to zero in $\Delta_0$ [58].

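Thus, let

$$\iint_{\Delta_0} f(x, y)\,\overline{\varphi_m(x)}\,\varphi_n(y)\,dx\,dy = 0,$$

i.e.

$$\int_a^b \Big[\int_a^b f(x, y)\,\varphi_n(y)\,dy\Big]\,\overline{\varphi_m(x)}\,dx = 0,$$

and, since system $\varphi_m(x)$ is closed, we obtain on passing to conjugates:

$$\int_a^b \overline{f(x, y)}\;\overline{\varphi_n(y)}\,dy = 0 \ \text{ almost everywhere with respect to } x \text{ in } [a, b],$$

and it can be asserted, by the same arguments, that $f(x, y) = 0$ almost
everywhere in $\Delta_0$, and the lemma is proved. Let $b_{mn}$ be the Fourier
coefficients of the kernel $K(x, y)$, belonging to $L_2$ in $\Delta_0$ by (108); then

$$b_{mn} = \iint_{\Delta_0} K(x, y)\,\overline{\varphi_m(x)}\,\varphi_n(y)\,dx\,dy.$$

We find the square of the absolute norm of the operator $A$ corres-
ponding to this kernel [138]:

$$N^2(A) = \sum_{n,m} |(A\varphi_n, \varphi_m)|^2 = \sum_{n,m} \left|\int_a^b \Big[\int_a^b K(x, y)\,\varphi_n(y)\,dy\Big]\,\overline{\varphi_m(x)}\,dx\right|^2 =$$

$$= \sum_{n,m} \left|\iint_{\Delta_0} K(x, y)\,\overline{\varphi_m(x)}\,\varphi_n(y)\,dx\,dy\right|^2 = \sum_{n,m} |b_{mn}|^2,$$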
but, by the closure equation, the last sum is equal to the integral of
(108). If condition (108) is fulfilled, and $A$ is a self-conjugate operator,
we have

$$\iint_{\Delta_0} |K(x, y)|^2\,dx\,dy = \sum_{k} \lambda_k^2, \tag{110}$$

where $\lambda_k$ are the eigenvalues.
Everything said in [135] and [136] is preserved in the case of an
infinite interval, even for the multi-dimensional operator

$$\varphi(x) = \int_D K(x, y)\,f(y)\,dy, \tag{111}$$

where $x(x_1, x_2, \ldots, x_n)$ and $y(y_1, y_2, \ldots, y_n)$ are points of $n$-dimensional
space $R_n$; $dy = dy_1\,dy_2 \ldots dy_n$ and $D$ is a domain of $R_n$.
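
Formula (110) can be checked on a kernel whose eigenvalues are known in closed form. The sketch below is an added illustration, not an example from the text: it takes $K(x, y) = \min(x, y)$ on $[0, 1]$, for which the eigenfunctions are $\sin((k - \tfrac12)\pi x)$ with eigenvalues $\lambda_k = 1/((k - \tfrac12)^2\pi^2)$, and compares a midpoint-rule value of the double integral of $K^2$ with the sum of the $\lambda_k^2$. Both should be close to $1/6$. The function names and the quadrature are assumptions of the sketch.

```python
import math

def kernel_square_integral(n=400):
    """Midpoint-rule value of the double integral of min(x, y)^2 over the unit square."""
    h = 1.0 / n
    pts = [(i + 0.5) * h for i in range(n)]
    return sum(min(x, y) ** 2 for x in pts for y in pts) * h * h

def eigenvalue_square_sum(terms=2000):
    """Partial sum of lambda_k^2 with lambda_k = 1/((k - 1/2)^2 * pi^2)."""
    return sum(1.0 / (((k - 0.5) * math.pi) ** 4) for k in range(1, terms + 1))

if __name__ == "__main__":
    print("double integral of K^2:", kernel_square_integral())
    print("sum of lambda_k^2:     ", eigenvalue_square_sum())
    print("exact value 1/6:       ", 1.0 / 6.0)
```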


175. Spectral functions. Let $K$ denote the operator (93), which we assume
to be completely continuous and self-conjugate. We shall describe how to form
the spectral function $\mathcal{E}_\lambda$ for it, and the resolvent $R_l = (K - lE)^{-1}$.

We introduce instead of $\mathcal{E}_\lambda$ another function which is expressible as an integral
operator, viz. we put

$$\theta_\lambda = \mathcal{E}_\lambda \ \text{ for } \lambda < 0; \quad \theta_\lambda = \mathcal{E}_\lambda - E \ \text{ for } \lambda > 0 \ \text{ and } \ \theta_0 = 0. \tag{112}$$

Since the spectrum is purely point, we can say that $\mathcal{E}_\lambda$ is a projection operator
onto the subspace of the eigenfunctions $\varphi_k(x)$ for which $\lambda_k \le \lambda$. The projection
of the function $f(x)$ onto the one-dimensional subspace of the eigenfunction
$\varphi_k(x)$ is the product $a_k\varphi_k(x)$, where $a_k$ are the Fourier coefficients of $f(x)$:

$$a_k\varphi_k(x) = \int_a^b \varphi_k(x)\,\overline{\varphi_k(y)}\,f(y)\,dy.$$

Thus the projection operator onto the one-dimensional subspace is an in-
tegral operator with kernel $\varphi_k(x)\,\overline{\varphi_k(y)}$, and we can write

$$\theta_\lambda f(x) = \int_a^b \sum_{\lambda_k \le \lambda} \varphi_k(x)\,\overline{\varphi_k(y)}\,f(y)\,dy \quad \text{for } \lambda < 0,$$

where the summation is over the $k$ for which $\lambda_k \le \lambda$, and the sum contains a
finite number of terms, by virtue of the property of the spectrum of a complete-
ly continuous operator. Thus, when $\lambda < 0$, $\theta_\lambda$ is an integral operator with the
kernel

$$\theta(x, y; \lambda) = \sum_{\lambda_k \le \lambda} \varphi_k(x)\,\overline{\varphi_k(y)} \quad \text{for } \lambda < 0. \tag{113}$$


On using (112) and what has been said regarding $\mathcal{E}_\lambda$, we can say that $\theta_\lambda$,
with $\lambda > 0$, is an integral operator with kernel

$$\theta(x, y; \lambda) = -\sum_{\lambda_k > \lambda} \varphi_k(x)\,\overline{\varphi_k(y)} \quad \text{for } \lambda > 0, \tag{114}$$

where the sum again contains a finite number of terms. When $\lambda$ passes through
an eigenvalue, the kernel has a jump. It follows at once from (112) that $\theta_\lambda^2 = \theta_\lambda$
for $\lambda < 0$ and $\theta_\lambda^2 = -\theta_\lambda$ for $\lambda > 0$, which can be written as

$$\int_a^b \theta(x, t; \lambda)\,\theta(t, y; \lambda)\,dt = \pm\,\theta(x, y; \lambda) \quad \begin{cases} + \text{ for } \lambda < 0, \\ - \text{ for } \lambda > 0. \end{cases} \tag{115}$$

The function $R_l f(x)$ is obviously a solution of equation (109) with $\lambda = l$,
on the assumption that $l \ne 0$ and does not coincide with one of the $\lambda_k$. Another
expression can be found for the resolvent, by starting from the equation

$$R_l f(x) = \int_{-\infty}^{+\infty} \frac{d\mathcal{E}_\lambda f(x)}{\lambda - l}, \tag{116}$$

where the integration is actually carried out over a finite interval containing
the spectrum of the operator.

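Let $\lambda = 0$ not be an eigenvalue. On replacing $\mathcal{E}_\lambda$ by $\theta_\lambda$ in accordance with
(112) and taking into account the extra jump of $\theta_\lambda$ on passing through
$\lambda = 0$, equal to $(-E)$, we can write the above equation in the form

$$R_l f(x) = -\frac{1}{l}\,f(x) + \lim_{\substack{\varepsilon_1 \to +0 \\ \varepsilon_2 \to +0}} \left\{ \int_{-\infty}^{-\varepsilon_1} \frac{d_\lambda\big[\int_a^b \theta(x, y; \lambda)\,f(y)\,dy\big]}{\lambda - l} + \int_{\varepsilon_2}^{+\infty} \frac{d_\lambda\big[\int_a^b \theta(x, y; \lambda)\,f(y)\,dy\big]}{\lambda - l} \right\}. \tag{117}$$
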

176. The spectral function (continued). The spectral function was introduced
for very general integral operators by Carleman in his work Sur les équations
intégrales singulières à noyau réel et symétrique (1921). An exposition of
this theory from the modern standpoint may be found in Stone's Linear
Transformations in Hilbert Space... (1932), and in N. I. Akhiezer's article
"Integral'nye operatory s yadrami Karlemana" (Integral operators with Carle-
man kernels) (1947). Integral operators of a more general type than the self-
conjugate bounded type (to be discussed in a moment) are investigated in
these works. The case of bounded self-conjugate operators was considered by
Hilbert, Hellinger and others.

The results for this last type will be given in broad outline. The kernel
$K(x, y)$ of a bounded self-conjugate operator $K$ can be approximated by kernels
$K_n(x, y)$ ($n = 1, 2, \ldots$), which correspond to completely continuous self-
conjugate operators. This gives us the possibility of showing that the operator
$\theta_\lambda$, given by (112), where $\mathcal{E}_\lambda$ is the spectral function of operator $K$, is an integral
operator for $\lambda \ne 0$, and that the formulae hold:

$$\theta_\lambda f(x) = \int_a^b \theta(x, y; \lambda)\,f(y)\,dy, \tag{118}$$

$$\int_a^b K(x, y)\,f(y)\,dy = \int_{-\infty}^{+\infty} \lambda\,d_\lambda\Big[\int_a^b \theta(x, y; \lambda)\,f(y)\,dy\Big], \tag{119}$$

as also (115). The integral on the right-hand side of (119) is to be understood
as improper as regards $\lambda = 0$.

We shall assume that operator (93) does not have a point spectrum, and we
deduce for it the formulae of the general theory of [149]. Let $\omega_k(x)$ be the ele-
ments of $L_2$ corresponding to the $y_k$ of [149], where it can be assumed that the
$\omega_k(x)$ form an orthonormal system. By making use of $\theta(x, y; \lambda)$, a complete set
of differential solutions can be obtained:

$$\pi_k(x, \lambda) = \int_a^b \theta(x, y; \lambda)\,\omega_k(y)\,dy \quad \text{for } \lambda < 0,$$

$$\pi_k(x, \lambda) = \int_a^b \theta(x, y; \lambda)\,\omega_k(y)\,dy + \omega_k(x) \quad \text{for } \lambda > 0.$$

The operator $\theta_\lambda$ has a jump equal to $(-E)$ at the point $\lambda = 0$.
We have

$$\varrho_k(\lambda) = \int_a^b |\pi_k(x, \lambda)|^2\,dx, \tag{120}$$
and, if g(x) and y(x) are any two elements of L,, on putting 
b 
= ) mty(a, A) y(a) dx, (121) 
b 
= J mlz, 4) p(a) da, (122) 
we can write the formulae of [149] as 
b en 
a Agg(A) dy (A) 123 
| Hx) Wa) dx = 3 J eat (123) 
ot 9 1g,(A) dhy(A) 
pla) ee fl Bole ae 124 
J hee ¥) Hy) WE) de dy = 3 ae Seah (124) 
: Agate dh a) 
Hodee D>, | eee 125 
| oe) vei ae = 3 | Sr ee (125) 


a m 


where $m$ and $M$ are the bounds of the operator, whilst the integral on the left-
hand side of (124) must be understood as iterated in any order. If $\varphi(x)$ belongs
to the lineal $l$ on which integral (99) has a meaning, (99) exists as a double
integral. The remaining formulae of [149] have the form


M — 
dgy, (A) 
p(t) = PY “Jordy o%® a), (126) 
b 
[xe new dy = 3 j 1 BO ang (2,2), (127) 
& P(x) = Pal HO an My (@, ft): (128) 
In the case of an infinite number of terms, the series must be convergent
to the quantities on the left.
The differential solutions $\pi(x, \lambda)$ satisfy the equation

$$\int_a^b K(x, y)\,\pi(y, \lambda)\,dy = \int_m^\lambda \mu\,d_\mu\pi(x, \mu), \tag{129}$$

where we assume $\pi(x, m) = 0$ as usual. The orthogonality properties of such
solutions are expressed by

$$\int_a^b \Delta_1\pi_p(x, \lambda) \cdot \overline{\Delta_2\pi_q(x, \lambda)}\,dx = 0 \quad (p \ne q).$$

The above formulae can be obtained by starting from any complete ortho-
gonal system of differential solutions $\pi_k(x, \lambda)$. Examples of integral operators
in $L_2$ will be considered in subsequent sections.


177. Unitary transformations in $L_2$. Not every unitary trans-
formation $\varphi(x) = Uf(x)$ in $L_2$ is expressible in the integral form.

The identity transformation $\varphi(x) = f(x)$ can be quoted as an
example. But it can be written in the integral form if we pass to the
primitive function:

$$\int_0^x \varphi(y)\,dy = \int_{-\infty}^{+\infty} K(x, y)\,f(y)\,dy,$$

where $(-\infty, +\infty)$ has been taken as the basic interval, $K(x, y) = 1$
for $0 < y < x$ and $K(x, y) = 0$ for $y < 0$ and $y > x$, if $x > 0$, and
similarly for $x < 0$. A similar result holds for any bounded operator $A$.
Let $(-\infty, +\infty)$ be the basic interval. We fix some value of $x$ and
consider the primitive for $Af(x)$:

$$U(f) = \int_0^x [Af(t)]\,dt.$$

We have the distributive property for $U(f)$, and, by Buniakowski's
inequality:

$$|U(f)| \le \Big[\int_0^x |Af(t)|^2\,dt\Big]^{1/2} \Big[\int_0^x dt\Big]^{1/2} \le \|A\|\,\sqrt{x}\,\|f\| \quad (x > 0).$$

It follows from this that $U(f)$ can be regarded as a linear bounded
functional, depending on the parameter $x$, and, by the theorem of
[123], we have

$$\int_0^x [Af(t)]\,dt = \int_{-\infty}^{+\infty} K(x, y)\,f(y)\,dy,$$

where $K(x, y) \in L_2(-\infty, +\infty)$ with respect to $y$ (for any $x \in (-\infty,
+\infty)$) and

$$\int_{-\infty}^{+\infty} |K(x, y)|^2\,dy \le \|A\|^2\,|x|.$$

We shall shortly prove a theorem that gives the general analytic
form of a unitary transformation with the aid of a passage to the
primitive function. This theorem was first proved by Bochner (Annals
of Mathem., Vol. 35, No. 1, 1934), who considered the interval $0 < x <
+\infty$. The proof does not depend on the choice of interval. We shall
take $(-\infty, +\infty)$ for definiteness and write $L_2$ for the class of functions
of $L_2$ in $(-\infty, +\infty)$ (see F. Riesz and B. Sz. Nagy, Leçons d'Analyse
Fonctionnelle, p. 316).

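THEOREM. Let $K(x, y)$ and $L(x, y)$ belong to $L_2$ with respect to $y$ for any
fixed $x$ of $(-\infty, +\infty)$, whilst the formulae hold for all $a$ and $b$ of
$(-\infty, +\infty)$:

$$\left.\begin{aligned} &\int_{-\infty}^{+\infty} K(a, y)\,\overline{K(b, y)}\,dy \\ &\int_{-\infty}^{+\infty} L(a, y)\,\overline{L(b, y)}\,dy \end{aligned}\right\} = \begin{cases} \min(|a|, |b|) & \text{for } ab > 0, \\ 0 & \text{for } ab \le 0, \end{cases} \tag{130}$$

and

$$\int_0^b K(a, y)\,dy = \int_0^a \overline{L(b, y)}\,dy. \tag{131}$$

Now, the formulae

$$\int_0^a \varphi(y)\,dy = \int_{-\infty}^{+\infty} \overline{L(a, y)}\,f(y)\,dy, \tag{132}$$

$$\int_0^a f(y)\,dy = \int_{-\infty}^{+\infty} \overline{K(a, y)}\,\varphi(y)\,dy \tag{133}$$

define a unitary transformation $\varphi(x) = Uf(x)$ and its inverse.
Conversely, if $\varphi(x) = Uf(x)$ is a unitary transformation, there exist
functions $K(x, y)$, $L(x, y)$ with the above-mentioned properties, with the
aid of which $U$ and $U^{-1}$ can be expressed by (132) and (133).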
Let us start by proving the first half of the theorem.
We introduce the function $f_a(x)$:

$$f_a(x) = \begin{cases} 1 & \text{for } 0 < x < a, \\ 0 & \text{for } x < 0 \text{ and } x > a \end{cases} \quad (a > 0);$$

$$f_a(x) = \begin{cases} -1 & \text{for } a < x < 0, \\ 0 & \text{for } x < a \text{ and } x > 0 \end{cases} \quad (a < 0)$$

and $f_0(x) \equiv 0$, and we define operators $U_0$ and $V_0$ as follows:

$$K(a, x) = U_0 f_a(x); \quad L(a, x) = V_0 f_a(x). \tag{134}$$


On forming all possible finite linear combinations of functions $f_a(x)$
for different $a$, we get a lineal $l$ of piecewise constant functions, i.e.
functions that take constant values on a finite number of finite inter-
vals and are zero outside these intervals. The values of the functions
at the ends of the intervals have no importance here, since equivalent
functions are regarded as identical. We extend operators $U_0$ and $V_0$
on to the lineal $l$, taking as our basis the distributive properties of $U_0$
and $V_0$. It may easily be seen that this extension is unique. Let $U$
and $V$ denote the distributive operators on $l$.

Formulae (130), (131) and (134) give us

$$(U_0 f_a, U_0 f_b) = (f_a, f_b); \quad (V_0 f_a, V_0 f_b) = (f_a, f_b), \tag{135}$$

$$(U_0 f_a, f_b) = (f_a, V_0 f_b). \tag{136}$$

Since the operators are distributive, we can write the same formulae
for $U$ and $V$ on $l$:

$$(Uf, Ug) = (f, g); \quad (Vf, Vg) = (f, g); \tag{137}$$

$$(Uf, g) = (f, Vg), \tag{138}$$

where $f$ and $g \in l$. It follows from (137) that $U$ and $V$ do not change
the norms of elements on $l$, and, since the lineal $l$ is dense in $L_2$ [60],
we get a unique extension of $U$ and $V$ (in continuity) to the whole
of $L_2$. In view of the continuity of the scalar product, (137) and (138)
are preserved in $L_2$, and operators $U$ and $V$ do not change the norms
and scalar product in $L_2$. It follows from (138) that $V = U^*$. On re-
placing $f$ by $Vf$ in (138) and using (137), we get $UU^* = E$, and simi-
larly, on replacing $g$ by $Ug$, we get $U^*U = E$. Hence it follows that $U$
is a unitary transformation and $V$ is its inverse [137]. It remains to
obtain (132) and (133). The first is obtained from (138) by putting
$g(x) = f_a(x)$, and the second by putting $f(x) = f_a(x)$ and passing to
the conjugates.

Let us turn to the proof of the second part of the theorem. Given
the unitary operator $U$, and $V = U^{-1} = U^*$, we form the functions

$$K(a, x) = U f_a(x); \quad L(a, x) = U^{-1} f_a(x). \tag{139}$$

On introducing the notation $g(x) = Uf(x)$ as above, and using the
fact that $U$ is unitary, we get

$$(g, f_a) = (Uf, f_a) = (f, U^{-1} f_a); \quad (f, f_a) = (U^{-1} g, f_a) = (g, U f_a),$$

which, by (139), leads to (132) and (133). Formulae (137) and (138)
hold for the $U$ and $V$ given above. On putting $f(x) = f_a(x)$ and
$g(x) = f_b(x)$ in them, we get (130) and (131). The proof is complete.


178. Fourier transformations. Watson considered the interval
$0 < x < +\infty$ and a kernel of the form

$$K(a, x) = \frac{\chi(ax)}{x}; \quad L(a, x) = \frac{\overline{\chi(ax)}}{x},$$

on the assumption that $\chi(0) = 0$ and $\chi(x)/x \in L_2(0, +\infty)$.
Conditions (130) take the form

$$\int_0^{\infty} \frac{\chi(ax)\,\overline{\chi(bx)}}{x^2}\,dx = \min(a, b) \qquad (a > 0;\ b > 0),$$

whilst (131) is fulfilled automatically.
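Let us turn to Fourier transformations for which the basic interval
is $(-\infty, +\infty)$ and

$$K(a, x) = \frac{1}{\sqrt{2\pi}}\,\frac{e^{-iax} - 1}{-ix}; \quad L(a, x) = \frac{1}{\sqrt{2\pi}}\,\frac{e^{iax} - 1}{ix}. \tag{140}$$

The modulus of the numerators does not exceed 2, and both functions
belong to $L_2$. Condition (131) is easily verified. Let us verify (130).
These reduce to a single expression, as above:

$$I = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \frac{(e^{-iax} - 1)(e^{ibx} - 1)}{x^2}\,dx = \begin{cases} \min(|a|, |b|) & \text{for } ab > 0, \\ 0 & \text{for } ab < 0. \end{cases} \tag{141}$$

By differentiating with respect to the parameter $a$, it is easily seen
that

$$\int_{-\infty}^{+\infty} \frac{\sin^2 ax}{x^2}\,dx = \pi |a| \qquad (a \text{ real}). \tag{142}$$

The integral $I$ is easily transformed to

$$I = \frac{1}{\pi} \int_{-\infty}^{+\infty} \frac{\sin^2 \frac{ax}{2} + \sin^2 \frac{bx}{2} - \sin^2 \frac{(a - b)x}{2}}{x^2}\,dx,$$

and application of (142) gives us

$$I = \tfrac{1}{2}\,(|a| + |b| - |a - b|),$$

whence (141) follows. We thus obtain a unitary Fourier transformation,
for which we employ the symbol $T$: $F(x) = Tf(x)$, in the following
form:

$$\int_0^a F(x)\,dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} \frac{e^{-iax} - 1}{-ix}\,f(x)\,dx; \quad \int_0^a f(x)\,dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} \frac{e^{iax} - 1}{ix}\,F(x)\,dx. \tag{143}$$

Another possible form of the Fourier transformation was employed
earlier [II; 160].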
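
The last step in the verification of (141), namely that the closed form obtained from (142) agrees with the right-hand side of (141), can be spot-checked by direct arithmetic. The following is a minimal sketch only; the function names `closed_form` and `expected` are illustrative and do not come from the text.

```python
def closed_form(a: float, b: float) -> float:
    """Value of I obtained from (142): (|a| + |b| - |a - b|) / 2."""
    return (abs(a) + abs(b) - abs(a - b)) / 2.0


def expected(a: float, b: float) -> float:
    """Right-hand side of (141): min(|a|, |b|) for ab > 0, and 0 for ab < 0."""
    return min(abs(a), abs(b)) if a * b > 0 else 0.0


# A few sample pairs (a, b) with both signs of the product ab.
SAMPLES = [(1.0, 2.5), (-3.0, -0.5), (2.0, -1.0), (-4.0, 4.0), (0.75, 0.75)]

if __name__ == "__main__":
    for a, b in SAMPLES:
        print(a, b, closed_form(a, b), expected(a, b))
```

When $a$ and $b$ have the same sign, $|a - b|$ is the difference of the moduli and the expression reduces to the smaller modulus; when they have opposite signs, $|a - b| = |a| + |b|$ and the expression vanishes.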

Suppose first that $f(x)$ vanishes outside some finite interval $[-n, +n]$.
By hypothesis, $f(x) \in L_2$ in $[-n, +n]$, so that it $\in L_1$ in
$[-n, +n]$; thus, given any real $y$, the integral exists:

$$F_1(y) = \frac{1}{\sqrt{2\pi}} \int_{-n}^{+n} e^{-iyx} f(x)\,dx. \tag{144}$$

It may easily be shown that $F_1(y)$ is continuous and has derivatives.
We have $|e^{-iyx} f(x)| = |f(x)|$, and we can integrate with respect to $y$
over a finite interval under the integral sign:

$$\int_0^a F_1(y)\,dy = \frac{1}{\sqrt{2\pi}} \int_{-n}^{+n} \frac{e^{-iax} - 1}{-ix}\,f(x)\,dx.$$

On comparing this formula with (143) and noting that $a$ is arbitrary,
and that $f(x)$ vanishes by hypothesis outside $[-n, +n]$, we can say
that $F_1(y)$ is equivalent to $F(y)$, i.e. in the present case the Fourier
transformation can be written as

$$F(y) = \frac{1}{\sqrt{2\pi}} \int_{-n}^{+n} e^{-iyx} f(x)\,dx. \tag{145}$$
In the general case, the integral

$$\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{-iyx} f(x)\,dx \tag{146}$$

may be meaningless, since the fact that $f(x)$ belongs to $L_2$ in the
interval $(-\infty, +\infty)$ does not imply that it belongs to $L_1$. Let us take
the function $f_{n,m}(x)$, which is equal to $f(x)$ for $-n < x < m$ and zero
for $x > m$ and $x < -n$. As we have seen, the Fourier transformation
of this function is given by

$$F_{n,m}(y) = \frac{1}{\sqrt{2\pi}} \int_{-n}^{m} e^{-iyx} f(x)\,dx. \tag{147}$$

But $f_{n,m}(x) \Rightarrow f(x)$ in $L_2$ as $n \to \infty$ and $m \to \infty$, so that
$Tf_{n,m}(x) \Rightarrow Tf(x)$. If lim is used to denote the limit in the mean
(in $L_2$), we can write the Fourier transformation for any function of $L_2$
in the form

$$F(y) = Tf = \lim_{n, m \to \infty} \frac{1}{\sqrt{2\pi}} \int_{-n}^{m} e^{-iyx} f(x)\,dx. \tag{148}$$

If $f(x)$ belongs to $L_1$ as well as $L_2$ in the interval $(-\infty, +\infty)$, given
any real $y$, integral (146) exists, this being the limit of integral (147)
as $m$ and $n \to \infty$. But, if the limit in the mean, and the limit
everywhere, exist, they are the same, so that the Fourier transformation
can be written in the present case as

$$F(y) = Tf = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{-iyx} f(x)\,dx \qquad (f(x) \in L_1 \text{ and } L_2). \tag{149}$$

Everything said above is true for the conjugate (inverse)
transformation. On putting $m = n$, we have instead of (148):

$$f(y) = T^*F = \lim_{n \to \infty} \frac{1}{\sqrt{2\pi}} \int_{-n}^{+n} e^{iyx} F(x)\,dx \tag{150}$$

and instead of (149):

$$f(y) = T^*F = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{iyx} F(x)\,dx \qquad (F(x) \in L_1 \text{ and } L_2). \tag{151}$$


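A convolution formula, to be obtained shortly, holds for the Fourier
transformation [cf. IV; 45]. Let $g(t)$ and $f(t) \in L_2$. Given any real $x$,
$g(x - t)$, regarded as a function of $t$, clearly belongs to $L_2$. Let us find
$T[g(x - t)]$:

$$T[g(x - t)] = \lim_{a \to \infty} \frac{1}{\sqrt{2\pi}} \int_{-a}^{a} g(x - t)\,e^{-iyt}\,dt = \lim_{a \to \infty} \frac{1}{\sqrt{2\pi}} \int_{x-a}^{x+a} g(u)\,e^{-iy(x-u)}\,du = e^{-iyx} \lim_{a \to \infty} \frac{1}{\sqrt{2\pi}} \int_{x-a}^{x+a} g(t)\,e^{iyt}\,dt,$$

i.e. $T[g(x - t)] = e^{-iyx}\,T^*[g(t)]$, and we can write, on recalling that the
unitary transformation does not change a scalar product:

$$\int_{-\infty}^{+\infty} g(x - t) f(t)\,dt = \int_{-\infty}^{+\infty} e^{-ixy}\,T^*[g(t)]\;\overline{T[\overline{f(t)}]}\,dy,$$

where

$$T^*[g(t)] = \lim_{n \to \infty} \frac{1}{\sqrt{2\pi}} \int_{-n}^{+n} g(t)\,e^{iyt}\,dt, \qquad \overline{T[\overline{f(t)}]} = \lim_{n \to \infty} \frac{1}{\sqrt{2\pi}} \int_{-n}^{+n} f(t)\,e^{iyt}\,dt = T^*[f(t)],$$

so that

$$\int_{-\infty}^{+\infty} g(x - t) f(t)\,dt = \int_{-\infty}^{+\infty} G_1(y)\,F_1(y)\,e^{-ixy}\,dy, \tag{152}$$

where $G_1(y) = T^*g$ and $F_1(y) = T^*f$.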

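The basic theorem can be proved in precisely the same way for
functions of several variables. Here, the unitary transformation $T$
is given by

$$Tf = \lim_{m_k \to \infty} \frac{1}{(2\pi)^{n/2}} \int_{-m_1}^{+m_1} \cdots \int_{-m_n}^{+m_n} f(x_1, \ldots, x_n)\,e^{-i(x_1 y_1 + \ldots + x_n y_n)}\,dx_1 \ldots dx_n, \tag{153}$$

and the inverse transformation by

$$T^*(F) = \lim_{m_k \to \infty} \frac{1}{(2\pi)^{n/2}} \int_{-m_1}^{+m_1} \cdots \int_{-m_n}^{+m_n} F(y_1, \ldots, y_n)\,e^{i(x_1 y_1 + \ldots + x_n y_n)}\,dy_1 \ldots dy_n. \tag{154}$$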


Returning to the case of a single variable, certain further properties
of the Fourier transformation may be mentioned. If we replace $x$ by
$(-x)$ in (148), compare with (150) and remember that $T^* = T^{-1}$, we
get $T^2 f(x) = f(-x)$, and similarly, $T^{*2} F(y) = F(-y)$. If $f(x)$ is an even
function, the transformation $T$ yields a function equivalent to an
even function, and we have

$$F(y) = \lim_{n \to \infty} \sqrt{\frac{2}{\pi}} \int_0^n f(x) \cos xy\,dx; \quad f(x) = \lim_{n \to \infty} \sqrt{\frac{2}{\pi}} \int_0^n F(y) \cos xy\,dy.$$

These formulae give a unitary transformation for functions of $L_2$ in
the interval $(0, \infty)$. On changing the sign and multiplying by $i$ (these
operations are clearly unitary transformations), we obtain, in the case
of odd functions, the following mutually inverse unitary
transformations in $(0, \infty)$:

$$F(y) = \lim_{n \to \infty} \sqrt{\frac{2}{\pi}} \int_0^n f(x) \sin xy\,dx; \quad f(x) = \lim_{n \to \infty} \sqrt{\frac{2}{\pi}} \int_0^n F(y) \sin xy\,dy.$$

179. Fourier transformations and Hermitian functions. We shall prove next
that the Fourier transformation has four eigenvalues $\pm 1$, $\pm i$, to which there is
a corresponding closed system of eigenfunctions, orthonormal in $(-\infty, +\infty)$;
these functions are, in fact, the Hermitian functions [III$_2$; 156]. Let us recall
the basic formulae concerning Hermitian functions. The Hermitian polynomials
are defined by

$$H_n(x) = (-1)^n e^{x^2} \frac{d^n}{dx^n}\left(e^{-x^2}\right)$$

and the Hermitian functions by

$$\psi_n(x) = e^{-x^2/2} H_n(x).$$

They are orthogonal in the interval $(-\infty, +\infty)$. The normalized Hermitian
functions are:

$$\varphi_n(x) = \frac{\psi_n(x)}{\sqrt{2^n\,n!\,\sqrt{\pi}}}.$$

They form a closed orthonormal system. Let us show that $\varphi_n(x)$ is the
eigenfunction of the operator $T$, corresponding to the eigenvalue $(-i)^n$, i.e.

$$T\varphi_n(x) = (-i)^n \varphi_n(x). \tag{155}$$

In other words, we have to show that

$$I = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{-iyx + x^2/2}\,\frac{d^n}{dx^n}\left(e^{-x^2}\right) dx = (-i)^n e^{y^2/2} \frac{d^n}{dy^n}\left(e^{-y^2}\right).$$

We integrate by parts and note that the terms outside the integral sign
vanish:

$$I = \frac{(-1)^n}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{-x^2}\,\frac{d^n}{dx^n}\left(e^{-iyx + x^2/2}\right) dx.$$

We now multiply by $e^{y^2/2}$ outside and by $e^{-y^2/2}$ inside the integral sign:

$$I = \frac{(-1)^n}{\sqrt{2\pi}}\,e^{y^2/2} \int_{-\infty}^{+\infty} e^{-x^2}\,\frac{d^n}{dx^n}\left(e^{(x - iy)^2/2}\right) dx = (-i)^n e^{y^2/2} \frac{d^n}{dy^n}\left[e^{-y^2/2} \cdot \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{-x^2/2 - ixy}\,dx\right].$$

On differentiating with respect to the parameter $y$, the last integral, taken
together with its factor $1/\sqrt{2\pi}$, is easily shown to be equal to $e^{-y^2/2}$,
and (155) is therefore proved. If account is taken of the closure of the system
of Hermitian functions, it can easily be shown that the points $\lambda = \pm 1$ and
$\lambda = \pm i$ exhaust the spectrum of operator $T$.
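
Relation (155) can be spot-checked numerically. The sketch below, which is an illustration and not part of the original argument, builds the normalized Hermitian functions from the standard recurrence $H_{k+1}(x) = 2xH_k(x) - 2kH_{k-1}(x)$ and approximates the Fourier integral by the trapezoidal rule on a truncated interval; the helper names, the truncation width and the step count are all assumptions made for the example.

```python
import cmath
import math


def hermite_function(n: int, x: float) -> float:
    """Normalized Hermitian function: e^{-x^2/2} H_n(x) / sqrt(2^n n! sqrt(pi))."""
    h_prev, h = 0.0, 1.0  # placeholder for H_{-1}, and H_0 = 1
    for k in range(n):
        # Recurrence H_{k+1}(x) = 2x H_k(x) - 2k H_{k-1}(x)
        h_prev, h = h, 2.0 * x * h - 2.0 * k * h_prev
    norm = math.sqrt(2.0 ** n * math.factorial(n) * math.sqrt(math.pi))
    return math.exp(-x * x / 2.0) * h / norm


def fourier_transform(n: int, y: float, half_width: float = 12.0, steps: int = 2400) -> complex:
    """Trapezoidal value of (1/sqrt(2 pi)) * integral of e^{-iyx} phi_n(x) dx over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0j
    for j in range(steps + 1):
        x = -half_width + j * h
        weight = 0.5 if j in (0, steps) else 1.0  # end points carry half weight
        total += weight * cmath.exp(-1j * y * x) * hermite_function(n, x)
    return total * h / math.sqrt(2.0 * math.pi)


if __name__ == "__main__":
    for n in range(4):
        y = 0.7
        print(n, fourier_transform(n, y), (-1j) ** n * hermite_function(n, y))
```

The truncation to a finite interval is harmless here because the Hermitian functions decay like $e^{-x^2/2}$; for slowly decaying functions of $L_2$ the limit in the mean of (148) would have to be treated with more care.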


180. The operation of multiplication. Let us consider the operation
of multiplication by the independent variable in a finite interval,
$x = 0$ being taken as the left-hand end of the interval. In other words,
we consider $L_2$ in the finite interval $[0, a]$, and the operation of
multiplication by the independent variable:

$$Af(x) = xf(x). \tag{156}$$

We have

$$(Af, g) = \int_0^a x f(x)\,\overline{g(x)}\,dx \quad \text{and} \quad (Af, f) = \int_0^a x\,|f(x)|^2\,dx,$$

whence it is clear that $A$ is a self-conjugate operator, and that its
norm does not exceed $a$. If we take $f(x)$ that are non-zero only in a
small neighbourhood of $x = a$, it may easily be seen that the norm
of $A$ is exactly equal to $a$. Given the condition $\|f\| = 1$, the bounds
of the quadratic form $(Af, f)$ are: $m = 0$ and $M = a$. The equation for
the eigenvalues and eigenelements has the form $xf(x) = \lambda f(x)$ or
$(x - \lambda)f(x) = 0$, whence it is clear that $f(x)$ is equivalent to zero, i.e.
there are no eigenvalues, and the spectrum is purely continuous.
The resolvent clearly has the form $R_\lambda f(x) = f(x)/(x - \lambda)$. If $\lambda$ lies
outside $[0, a]$, then $R_\lambda f(x) \in L_2$. If $\lambda$ is in $[0, a]$, $f(x)/(x - \lambda)$ does
not belong to $L_2$ for every $f(x)$. The operator $(A - \lambda E)f(x) = (x - \lambda)f(x)$ here
transforms $L_2$ one-to-one into the lineal $M_\lambda$ of functions
$g(x) = (x - \lambda)f(x)$ such that $g(x)/(x - \lambda) \in L_2$. Let us find the spectral
function $\mathcal{E}_\lambda$, where $\lambda$ must be assumed to belong to $[0, a]$. On observing that

$$\lim_{\tau \to +0} \frac{1}{2\pi i} \int_{-\infty}^{\lambda} \left(\frac{1}{x - \sigma - i\tau} - \frac{1}{x - \sigma + i\tau}\right) d\sigma = \lim_{\tau \to +0} \frac{1}{\pi}\left(\arctan\frac{\lambda - x}{\tau} + \frac{\pi}{2}\right) = \begin{cases} 0 & \text{for } \lambda < x, \\ 1 & \text{for } \lambda > x, \end{cases}$$

we obtain for any elements $f(x)$ and $\psi(x)$ [144]:

$$(\mathcal{E}_\lambda f, \psi) = \lim_{\tau \to +0} \frac{1}{2\pi i} \int_{-\infty}^{\lambda} \left[\int_0^a \left(\frac{1}{x - \sigma - i\tau} - \frac{1}{x - \sigma + i\tau}\right) f(x)\,\overline{\psi(x)}\,dx\right] d\sigma = \int_0^\lambda f(x)\,\overline{\psi(x)}\,dx,$$

whence it follows that

$$\mathcal{E}_\lambda f(x) = \begin{cases} f(x) & \text{for } x < \lambda, \\ 0 & \text{for } x > \lambda. \end{cases} \tag{157}$$

On taking $f(x) = 1$, we get the differential solution $\omega(x, \lambda) = 1$
for $x < \lambda$ and $\omega(x, \lambda) = 0$ for $x > \lambda$. It may easily be seen, on using
(157) and property 11 of [52], that there are no solutions orthogonal
to it.
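
The action of the spectral function (157) can be sketched in a few lines. The example below is only an illustration under stated assumptions: functions are represented as Python callables, the scalar product is approximated by the midpoint rule for real-valued $f$, and the names `spectral_projection` and `quadratic_form` are hypothetical.

```python
from typing import Callable

RealFunction = Callable[[float], float]


def spectral_projection(lam: float, f: RealFunction) -> RealFunction:
    """E_lambda f as in (157): f(x) for x < lambda, zero for x > lambda."""
    return lambda x: f(x) if x < lam else 0.0


def quadratic_form(lam: float, f: RealFunction, a: float, n: int = 1000) -> float:
    """Midpoint-rule approximation of (E_lambda f, f) on [0, a] for real f."""
    h = a / n
    ef = spectral_projection(lam, f)
    return sum(ef((j + 0.5) * h) * f((j + 0.5) * h) for j in range(n)) * h


if __name__ == "__main__":
    # For f = 1 on [0, 2] the form (E_lambda f, f) is the length of [0, lambda].
    print(quadratic_form(0.5, lambda x: 1.0, 2.0))
```

Applying two such projections in succession gives the projection with the smaller parameter, which is the usual monotonicity property of a spectral function.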

Let us consider the more general self-conjugate operator

$$Bf(x) = w(x) f(x), \tag{158}$$

where $w(x)$ is real, measurable and bounded in $[0, a]$. The equation for
the eigenvalues and eigenelements has the form $[w(x) - \lambda]f(x) = 0$.
Let $K_\lambda$ be the set of $x$ satisfying $w(x) = \lambda$. If the measure of $K_\lambda$ is zero,
$\lambda$ is not an eigenvalue. If the measure of $K_\lambda$ is greater than zero, $\lambda$ is
an eigenvalue, and any complete system of functions, orthogonal on
the set $K_\lambda$, is a complete system of eigenfunctions corresponding to
the eigenvalue in question, where these functions must be assumed
zero outside $K_\lambda$. If the measure of $K_\lambda$ is zero for any $\lambda$, operator (158)
has a purely continuous spectrum. Its spectral function is given by an
expression analogous to (157):

$$\mathcal{E}_\lambda f(x) = \begin{cases} f(x) & \text{for } w(x) < \lambda, \\ 0 & \text{for } w(x) > \lambda. \end{cases} \tag{159}$$

Everything said may be readily extended to the case of functions
of several variables. For instance, we can define a self-conjugate
operator of multiplication by the independent variable: $Af = x_1 f$, for
functions $f(x_1, x_2, \ldots, x_n)$ that belong to $L_2$ in some finite interval
$a_s \le x_s \le b_s$ $(s = 1, 2, \ldots, n)$. This operator has a purely continuous
spectrum in the interval $a_1 \le \lambda \le b_1$, and its spectral function is
defined as follows:

$$\mathcal{E}_\lambda f(x_1, x_2, \ldots, x_n) = \begin{cases} f(x_1, x_2, \ldots, x_n) & \text{for } x_1 < \lambda, \\ 0 & \text{for } x_1 > \lambda. \end{cases} \tag{160}$$

Let us return to the case of one variable. The operator of
multiplication by the independent variable in an infinite interval is no longer
bounded. We shall consider it below. If we take operator (158) and
assume that $w(x)$ is a bounded function in an infinite interval, we
now get a bounded linear operator. We shall thus take as our basis
space $L_2$ in $(-\infty, +\infty)$, and let $w(x)$ be a real, bounded and
measurable function in this interval. Now, (158) defines a self-conjugate
bounded operator. If $w(x)$ is continuous in the closed interval
$[-\infty, +\infty]$, the bounds of the operator coincide with the minimum and
maximum values of $w(x)$.
181. Kernels that depend on a difference. If we use operator (158) in the
interval $(-\infty, +\infty)$, and pass with the aid of the transformation $T$ to the
unitarily equivalent operators, bounded self-conjugate integral operators with
kernels depending on a difference are easily formed.

Let us outline the method. The unitary equivalent of (158): $B' = T^*BT$,
is evidently given by

$$B'f(x) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \left[w(y) \int_{-\infty}^{+\infty} f(t)\,e^{-iyt}\,dt\right] e^{ixy}\,dy.$$

Here and below, we shall simply write the integral with infinite limits instead
of $\lim_{n \to \infty}$. Assuming that $w(y)$ is summable in $(-\infty, +\infty)$ as well as bounded,
and that $f(t)$ of $L_2$ is also summable, we can change the order of integration,
and obtain

$$B'f(x) = \int_{-\infty}^{+\infty} \left[\frac{1}{2\pi} \int_{-\infty}^{+\infty} w(y)\,e^{i(x - t)y}\,dy\right] f(t)\,dt$$

or, on introducing the function

$$g(u) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} w(y)\,e^{iyu}\,dy = T^*w, \tag{161}$$

we can write the operator $B'$ as

$$B'f(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} g(x - t) f(t)\,dt. \tag{162}$$

As we know, the spectral function of $B'$ is given by $\mathcal{E}'_\lambda = T^*\mathcal{E}_\lambda T$, where $\mathcal{E}_\lambda$
is the spectral function of $B$. If the kernel satisfies condition (97) of [172], as
will be the case in the following examples, i.e.

$$\int_{-\infty}^{+\infty} |g(u)|\,du < +\infty, \tag{163}$$

(162) is evidently applicable to the whole of $L_2$, and not merely to the $f(x)$
which are summable in $(-\infty, +\infty)$. Let us consider some examples of this
method.
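1. Let

$$Bf(x) = \frac{2}{1 + x^2}\,f(x). \tag{164}$$

The bounds of the operator are $m = 0$ and $M = 2$. Given any $\lambda$ of the
interval $[0, 2]$, the equation $2/(1 + x^2) = \lambda$ has not more than two roots and
operator (164) has a purely continuous spectrum.
The kernel of the operator $B'$ is given by

$$g(u) = \frac{2}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} \frac{e^{iyu}}{1 + y^2}\,dy = \frac{4}{\sqrt{2\pi}} \int_0^{\infty} \frac{\cos yu}{1 + y^2}\,dy = \sqrt{2\pi}\,e^{-|u|}$$

and

$$B'f(x) = \int_{-\infty}^{+\infty} e^{-|x - y|} f(y)\,dy. \tag{165}$$

The kernel obtained satisfies condition (94) of [172]:

$$\int_{-\infty}^{+\infty} e^{-|x - y|}\,dy = \int_{-\infty}^{x} e^{y - x}\,dy + \int_{x}^{+\infty} e^{x - y}\,dy = 2.$$

By (159), the spectral function of operator (164) is defined thus:
$\mathcal{E}_\lambda f(x) = f(x)$ for $2/(1 + x^2) < \lambda$ and $\mathcal{E}_\lambda f(x) = 0$ for $2/(1 + x^2) > \lambda$, i.e.

$$\mathcal{E}_\lambda f(x) = \begin{cases} f(x) & \text{for } |x| > \mu, \\ 0 & \text{for } |x| < \mu, \end{cases}$$

where $\mu = \sqrt{(2 - \lambda)/\lambda}$, and

$$\mathcal{E}'_\lambda f(x) = T^*\mathcal{E}_\lambda T f(x) = \frac{1}{2\pi}\left(\int_{-\infty}^{-\mu} + \int_{\mu}^{+\infty}\right)\left[\int_{-\infty}^{+\infty} f(t)\,e^{-ity}\,dt\right] e^{ixy}\,dy,$$

i.e.

$$\mathcal{E}'_\lambda f(x) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \left[\int_{-\infty}^{+\infty} f(t)\,e^{-ity}\,dt\right] e^{ixy}\,dy - \frac{1}{2\pi} \int_{-\mu}^{\mu} \left[\int_{-\infty}^{+\infty} f(t)\,e^{-ity}\,dt\right] e^{ixy}\,dy.$$

The improper integrals with infinite limits must be taken in the sense of the
mean square approximations. On changing the order in the last integral (the
possibility of this is easily proved), and remembering that $T^*T = E$, we get

$$\mathcal{E}'_\lambda f(x) = f(x) - \frac{1}{\pi} \int_{-\infty}^{+\infty} \frac{\sin \mu(x - t)}{x - t}\,f(t)\,dt. \tag{166}$$

The operator $B'$ has a purely continuous spectrum, and $\mathcal{E}'_\lambda f(x)$ must tend
(in the mean) to zero as $\lambda \to 0$, i.e.

$$\lim_{\mu \to \infty} \frac{1}{\pi} \int_{-\infty}^{+\infty} \frac{\sin \mu(x - t)}{x - t}\,f(t)\,dt = f(x). \tag{167}$$

Let us form the differential solutions for operator (165). It is easily seen that
the homogeneous equation $B'f(x) = \lambda f(x)$ has solutions $\cos \mu x$ and $\sin \mu x$
not belonging to $L_2$, i.e.

$$\int_{-\infty}^{+\infty} e^{-|x - y|} \cos \mu y\,dy = \lambda \cos \mu x; \quad \int_{-\infty}^{+\infty} e^{-|x - y|} \sin \mu y\,dy = \lambda \sin \mu x.$$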
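
The two relations just written for operator (165) can be checked numerically, using $\lambda = 2/(1 + \mu^2)$, which follows from $\mu = \sqrt{(2 - \lambda)/\lambda}$. The following is a rough sketch rather than a definitive computation: the integral is truncated to a finite window and evaluated by the trapezoidal rule, and the helper name `apply_kernel` and its default parameters are assumptions of the example.

```python
import math
from typing import Callable


def apply_kernel(f: Callable[[float], float], x: float,
                 half_width: float = 40.0, steps: int = 40000) -> float:
    """Trapezoidal value of the integral of e^{-|x-y|} f(y) dy over [x - half_width, x + half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for j in range(steps + 1):
        u = -half_width + j * h  # u = y - x; the kink of e^{-|u|} at u = 0 falls on a grid node
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * math.exp(-abs(u)) * f(x + u)
    return total * h


if __name__ == "__main__":
    mu = 1.5
    lam = 2.0 / (1.0 + mu * mu)
    x = 0.8
    print(apply_kernel(lambda y: math.cos(mu * y), x), lam * math.cos(mu * x))
    print(apply_kernel(lambda y: math.sin(mu * y), x), lam * math.sin(mu * x))
```

The window can be truncated because the kernel decays exponentially, even though $\cos \mu x$ and $\sin \mu x$ themselves do not belong to $L_2$.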


On multiplying both sides by $e^{-\mu}\,d\mu/d\lambda$ and integrating with respect to $\lambda$
from $0$ to $\lambda$, or what amounts to the same thing, with respect to $\mu$ from $\mu = \infty$
to $\mu$, we get the following two differential solutions:

$$\omega_1(x, \lambda) = \int_{\infty}^{\mu} e^{-\mu} \cos \mu x\,d\mu = -e^{-\mu}\left(\frac{\cos \mu x}{1 + x^2} - \frac{x \sin \mu x}{1 + x^2}\right),$$

$$\omega_2(x, \lambda) = \int_{\infty}^{\mu} e^{-\mu} \sin \mu x\,d\mu = -e^{-\mu}\left(\frac{x \cos \mu x}{1 + x^2} + \frac{\sin \mu x}{1 + x^2}\right). \tag{168}$$

These functions belong to $L_2$, and it follows from the method by which
they are obtained that they satisfy equation (129) and vanish for $\lambda = 0$, i.e.
for $\mu = \infty$. The factor $e^{-\mu}$ is included so as to enable the integration to be
performed from $\mu = \infty$, and hence to obtain solutions continuous as far as
$\lambda = 0$. Solutions (168) are mutually orthogonal [176] in the basic interval
$(-\infty, +\infty)$, since one of them is even and the other odd.
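
The closed forms in (168) can be compared with a direct numerical evaluation of the defining integrals. This is a sketch under stated assumptions: the upper limit $\infty$ is replaced by a finite cut-off, the trapezoidal rule is used, and the function names are introduced here only for illustration.

```python
import math
from typing import Callable


def omega1_closed(x: float, mu: float) -> float:
    """First solution of (168): -e^{-mu} (cos(mu x) - x sin(mu x)) / (1 + x^2)."""
    return -math.exp(-mu) * (math.cos(mu * x) - x * math.sin(mu * x)) / (1.0 + x * x)


def omega2_closed(x: float, mu: float) -> float:
    """Second solution of (168): -e^{-mu} (x cos(mu x) + sin(mu x)) / (1 + x^2)."""
    return -math.exp(-mu) * (x * math.cos(mu * x) + math.sin(mu * x)) / (1.0 + x * x)


def omega_numeric(x: float, mu: float, trig: Callable[[float], float],
                  length: float = 40.0, steps: int = 40000) -> float:
    """Trapezoidal value of the integral of e^{-nu} trig(nu x) d nu taken from infinity down to mu."""
    h = length / steps
    total = 0.0
    for j in range(steps + 1):
        nu = mu + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * math.exp(-nu) * trig(nu * x)
    return -total * h  # integrating from infinity to mu reverses the sign


if __name__ == "__main__":
    print(omega_numeric(2.0, 0.5, math.cos), omega1_closed(2.0, 0.5))
    print(omega_numeric(2.0, 0.5, math.sin), omega2_closed(2.0, 0.5))
```

The first function is even in $x$ and the second is odd, which is the property used above to conclude that the two solutions are mutually orthogonal.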

Let us write down (120) and (121) of [176]. Simple working gives

$$\varrho_1(\lambda) = \varrho_2(\lambda) = \frac{\pi}{2}\,e^{-2\mu}$$

and

$$g_1(\lambda) = \int_{-\infty}^{+\infty} \left[\int_{\infty}^{\mu} e^{-\mu} \cos \mu y\,d\mu\right] \varphi(y)\,dy,$$

$$g_2(\lambda) = \int_{-\infty}^{+\infty} \left[\int_{\infty}^{\mu} e^{-\mu} \sin \mu y\,d\mu\right] \varphi(y)\,dy.$$

The completeness of the system of solutions (168) can be proved with the aid
of (128) of [176]. If we apply (123) to the real function $\varphi(x)$ of $L_2$ and the
function $\psi(x)$, equal to unity in the interval $(0, x]$ and zero outside it, we obtain
after elementary transformations:

$$\int_0^x \varphi(x)\,dx = \frac{1}{\pi} \int_0^{\infty} \left[\frac{\sin \mu x}{\mu} \int_{-\infty}^{+\infty} \varphi(y) \cos \mu y\,dy + \frac{1 - \cos \mu x}{\mu} \int_{-\infty}^{+\infty} \varphi(y) \sin \mu y\,dy\right] d\mu,$$

which, given certain supplementary assumptions, leads to the ordinary Fourier
formula. Notice that solutions (168) must be obtained by application of the
operator $\mathcal{E}'_\lambda$ to the functions $x_k$ $(k = 1, 2)$, i.e. to $1/(1 + x^2)$ and
$x/(1 + x^2)$; this is easily verified. If we had not included the factor $e^{-\mu}$ in
the integration, and had integrated from $\mu = 0$, we should have obtained the
simple differential solutions $(\sin \mu x)/x$ and $(1 - \cos \mu x)/x$, which become
meaningless with $\lambda = 0$, their norms increasing indefinitely as $\lambda \to 0$.

Let us consider the general case of transformation (162), on the assumption
that $g(y)$ is a real even function satisfying condition (163). Operator (162)
is now defined throughout $L_2$ and is bounded and self-conjugate. We can form

$$G_1(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} g(u)\,e^{itu}\,du; \quad F_1(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} f(u)\,e^{itu}\,du, \tag{169}$$

where

$$|G_1(t)| \le \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} |g(u)|\,du,$$

i.e. $G_1(t)$ is a bounded function and $F_1(t) \in L_2$, so that $G_1(t)F_1(t) \in L_2$. It can
be shown that (152) holds in this case, and can be written as

$$\varphi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} g(x - t) f(t)\,dt = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} G_1(t)\,F_1(t)\,e^{-ixt}\,dt = T[G_1(t)F_1(t)],$$

whence it follows that $G_1(t)F_1(t) = T^*[\varphi(x)]$, and, on taking into account
the second of formulae (169), (162) is seen to be the unitary equivalent of the
operation of multiplication by the function $G_1(t)$ of the independent variable $t$.

We must mention a type of kernel that leads to a kernel dependent on a
difference. Let $K(x, y)$ be a real symmetric kernel in $(0, \infty)$, which is a
homogeneous function of degree $-1$. If we replace $x$ and $y$ in the integral operator
with this kernel:

$$\varphi(x) = \int_0^{\infty} K(x, y) f(y)\,dy \tag{170}$$

by the new independent variables $x = e^s$ and $y = e^t$, and replace $\varphi(x)$ and
$f(y)$ by $\varphi_1(s) = e^{s/2} \varphi(e^s)$ and $f_1(t) = e^{t/2} f(e^t)$, we get the integral operator

$$\varphi_1(s) = \int_{-\infty}^{+\infty} K_1(s, t) f_1(t)\,dt$$

with a kernel depending on $|s - t|$. For, $K(x, y) = x^{-1} K(1, y/x)$ since it is
homogeneous, and we can write, on setting $K(1, z) = w(z)$:

$$K_1(s, t) = e^{\frac{s + t}{2}}\,e^{-s}\,w(e^{t - s}) = e^{\frac{t - s}{2}}\,w(e^{t - s}),$$

where, in view of the symmetry of $K(x, y)$, the last expression is an even
function of $(t - s)$. Since $ds = dx/x$, given our change of variables, space $L_2$ of
functions in $(0, \infty)$ is seen to become space $L_2$ of functions in $(-\infty, +\infty)$.
The norm of operator (170) can be found directly with the aid of the
following simple theorem:

THEOREM. If $K(x, y)$ is non-negative, homogeneous of degree $(-1)$ and

$$\int_0^{\infty} K(x, 1)\,x^{-1/2}\,dx = \int_0^{\infty} K(1, y)\,y^{-1/2}\,dy = k, \tag{171}$$

then

$$|I| = \left|\int_0^{\infty} \int_0^{\infty} K(x, y) f(x) g(y)\,dx\,dy\right| \le k\,\|f\| \cdot \|g\|. \tag{172}$$

Notice that, since the kernel is homogeneous, the integrals of (171) are equal.
On rewriting the integrand as $f(x)\sqrt{K}\,(x/y)^{1/4} \cdot g(y)\sqrt{K}\,(y/x)^{1/4}$ and using
Bunyakovskii's inequality, we get $|I| \le \sqrt{A}\sqrt{B}$, where

$$A = \int_0^{\infty} |f(x)|^2 \left[\int_0^{\infty} K(x, y)\left(\frac{x}{y}\right)^{1/2} dy\right] dx = k\,\|f\|^2,$$

and similarly, $B = k\,\|g\|^2$, whence (172) follows. It follows from (172) that
the norm of the operator with kernel $K(x, y)$ does not exceed $k$. In particular,
if we put $K(x, y) = 1/(x + y)$, since

$$\int_0^{\infty} \frac{x^{-1/2}}{1 + x}\,dx = \pi,$$

we obtain

$$\left|\int_0^{\infty} \int_0^{\infty} \frac{f(x) g(y)}{x + y}\,dx\,dy\right| \le \pi\,\|f\| \cdot \|g\|.$$

On performing the above change of variables, it can be shown that the
operator with kernel $1/(x + y)$ has a continuous spectrum in the interval
$[0, \pi]$.
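
For the kernel $1/(x + y)$ the change of variables described above gives $K_1$ as the function $e^{u/2}/(1 + e^u)$ of $u = t - s$. The sketch below, an illustration only with hypothetical helper names and an assumed truncation window, checks that this function is even and that its integral over $(-\infty, +\infty)$ reproduces the constant $k = \pi$ of (171).

```python
import math


def transformed_kernel(u: float) -> float:
    """K_1 as a function of u = t - s for K(x, y) = 1/(x + y): e^{u/2} / (1 + e^u)."""
    return math.exp(u / 2.0) / (1.0 + math.exp(u))


def kernel_integral(half_width: float = 80.0, steps: int = 16000) -> float:
    """Trapezoidal value of the integral of K_1(u) du over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for j in range(steps + 1):
        u = -half_width + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * transformed_kernel(u)
    return total * h


if __name__ == "__main__":
    print(kernel_integral(), math.pi)
```

The agreement is to be expected, since the substitution $y = e^u$ turns the second integral of (171) into the integral of $K_1$.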


182. Weak convergence. We have already investigated weak
convergence in $L_p$. Let us recall the basic results for the case $p = 2$. If we
consider $L_2(\mathcal{E})$, where $\mathcal{E}$ is any fixed measurable set, the weak
convergence $\varphi_n(x) \rightharpoonup \varphi(x)$ is defined by

$$\lim_{n \to \infty} \int_{\mathcal{E}} \varphi_n(x)\,\overline{\psi(x)}\,dx = \int_{\mathcal{E}} \varphi(x)\,\overline{\psi(x)}\,dx, \tag{173}$$

for any function $\psi(x) \in L_2(\mathcal{E})$. The necessary and sufficient condition
for weak convergence is as follows: (1) the norms $\|\varphi_n\|$ are bounded
in $L_2(\mathcal{E})$; (2) condition (173) is fulfilled on the set of elements
$\psi(x) \in L_2(\mathcal{E})$, the linear envelope of which is dense in $L_2(\mathcal{E})$. In the
one-dimensional case, if $\mathcal{E}$ is a finite or infinite interval, the second
condition can be replaced by

$$\lim_{n \to \infty} \int_c^{\xi} \varphi_n(x)\,dx = \int_c^{\xi} \varphi(x)\,dx,$$

where $c$ is any fixed number of the interval in question and $\xi$ is any
number from this interval.
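
As an illustration of this criterion, which is not taken from the text, consider the sequence $\varphi_n(x) = \sin nx$ in $L_2$ on $(0, 2\pi)$. The sketch below uses the closed forms of the two integrals involved; the function names are assumptions of the example. The norms stay bounded and the integrals over $(c, \xi)$ tend to zero, so the sequence converges weakly to zero, although it does not converge in norm.

```python
import math


def partial_integral(n: int, xi: float, c: float = 0.0) -> float:
    """Closed form of the integral of sin(n x) from c to xi."""
    return (math.cos(n * c) - math.cos(n * xi)) / n


def norm_squared(n: int) -> float:
    """Integral of sin^2(n x) over [0, 2 pi], i.e. pi - sin(4 pi n) / (4 n)."""
    return math.pi - math.sin(4.0 * math.pi * n) / (4.0 * n)


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, partial_integral(n, 1.0), norm_squared(n))
```

The modulus of the partial integral never exceeds $2/n$, while the squared norm equals $\pi$ for every integer $n \ge 1$.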

The following can be proved: if a sequence of functions $\varphi_n(x)$ of $L_2(\mathcal{E})$
is weakly convergent to some function $\varphi(x)$ and is convergent almost
everywhere on $\mathcal{E}$ to some function $\omega(x) \in L_2(\mathcal{E})$, $\varphi(x)$ and $\omega(x)$ must be
equivalent.


183. Other concrete forms of space H. In addition to $l_2$ and $L_2$,
a number of other useful concrete forms of Hilbert space may be
mentioned. Let $\mathcal{E}$ be a measurable set of $n$-dimensional space and $L_2$ the
space of functions, measurable and square summable on $\mathcal{E}$, the measure
being based on the Lebesgue measure or some other normal set
function. In the latter case, we have the Lebesgue-Stieltjes integral.
Let us define space $L_{2,m}$ as follows: An element of $L_{2,m}$ is a sequence
of $m$ functions of $L_2$: $(f_1, f_2, \ldots, f_m)$, where $f_k \in L_2$ $(k = 1, 2, \ldots, m)$.
An element is zero if each of the $f_k$ is equivalent to zero. Multiplication
of an element by a number, and addition of elements, are naturally
defined by

$$a\,(f_1, f_2, \ldots, f_m) = (af_1, af_2, \ldots, af_m),$$

$$(f_1, f_2, \ldots, f_m) + (g_1, g_2, \ldots, g_m) = (f_1 + g_1, f_2 + g_2, \ldots, f_m + g_m)$$

and the scalar product by

$$(f, g) = \int_{\mathcal{E}} (f_1 \overline{g_1} + f_2 \overline{g_2} + \ldots + f_m \overline{g_m})\,d\omega,$$

where $d\omega$ is an element of $\mathcal{E}$, in the case of the Lebesgue integral
or the differential of the normal set function in the case of the
Lebesgue-Stieltjes integral. It is easily verified that $L_{2,m}$ is a concrete
form of separable Hilbert space. A linear operator $y = Ax$ in $L_{2,m}$
consists of $m^2$ linear operators $A_{ik}$ $(i, k = 1, 2, \ldots, m)$ in $L_2$, with the
aid of which the components of $y$ are expressible in terms of the
components of $x$:

$$g_i = \sum_{k=1}^{m} A_{ik} f_k \qquad (i = 1, 2, \ldots, m).$$

This form of Hilbert space is a realization of an abstract
construction of Hilbert space $H$ from given Hilbert spaces $H_1, H_2, \ldots, H_m$.
An element $x$ of space $H$ is defined as a sequence of elements
$(x_1, x_2, \ldots, x_m)$, where $x_k \in H_k$. An element $x$ is zero if all the $x_k$ are zero
elements in the $H_k$ $(k = 1, 2, \ldots, m)$. Multiplication by a number
and addition of elements are defined by

$$a\,(x_1, x_2, \ldots, x_m) = (ax_1, ax_2, \ldots, ax_m),$$

$$(x_1, x_2, \ldots, x_m) + (y_1, y_2, \ldots, y_m) = (x_1 + y_1, x_2 + y_2, \ldots, x_m + y_m)$$

and the scalar product by

$$(x, y) = \sum_{k=1}^{m} (x_k, y_k).$$
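
The abstract construction just described can be sketched in code for the simplest case in which every $H_k$ is a finite-dimensional complex coordinate space. This choice of component spaces, and all the names below, are assumptions made for the illustration; the construction in the text applies to arbitrary Hilbert spaces $H_k$.

```python
from typing import List, Sequence

Vector = List[complex]      # an element of one component space H_k
Element = List[Vector]      # an element (x_1, ..., x_m) of the space H


def inner(x: Sequence[complex], y: Sequence[complex]) -> complex:
    """Scalar product in a component space: sum of x_j * conj(y_j)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def direct_sum_inner(x: Element, y: Element) -> complex:
    """Scalar product in H: the sum over k of (x_k, y_k)."""
    return sum(inner(xk, yk) for xk, yk in zip(x, y))


def add(x: Element, y: Element) -> Element:
    """Componentwise addition of two elements of H."""
    return [[a + b for a, b in zip(xk, yk)] for xk, yk in zip(x, y)]


def scale(a: complex, x: Element) -> Element:
    """Multiplication of an element of H by the number a."""
    return [[a * c for c in xk] for xk in x]


if __name__ == "__main__":
    x = [[1 + 1j, 2], [3j]]   # x_1 in a 2-dimensional space, x_2 in a 1-dimensional space
    y = [[1, 1j], [2]]
    print(direct_sum_inner(x, y), direct_sum_inner(x, x))
```

The component spaces are allowed to have different dimensions, as in the example, since only elements with the same index $k$ are ever combined.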


Every space $W_2^{(l)}(D)$ [112] of functions $\varphi(x)$ belonging to $L_2(D)$,
where $D$ is a domain of $n$-dimensional space, and having generalized
derivatives up to order $l$ that also belong to $L_2(D)$, is a complete Hilbert
space with the scalar product

$$(\varphi, \psi) = \int_D \Big[\varphi(x)\,\overline{\psi(x)} + \sum_{1 \le \alpha \le l} D^{\alpha}\varphi(x)\,\overline{D^{\alpha}\psi(x)}\Big]\,dx, \tag{174}$$

where the summation is extended over all derivatives up to and
including order $l$. It is assumed here that the domain is star-shaped with
respect to any point of it, so that the property of generalized
derivatives indicated in [111] holds.

Let us consider space $W_2^{(1)}(D)$. Functions $\varphi(x)$ belonging to it have
limiting values on the surface $S$ of domain $D$ ($S$ is assumed sufficiently
smooth). It is easily seen that the set of $\varphi(x) \in W_2^{(1)}(D)$, satisfying the
boundary condition

$$\varphi(x)\,|_S = 0,$$

is a complete Hilbert space with the scalar product

$$(\varphi, \psi) = \int_D \sum_{k=1}^{n} \frac{\partial \varphi(x)}{\partial x_k}\,\frac{\partial \overline{\psi(x)}}{\partial x_k}\,dx. \tag{175}$$

The set of functions of $W_2^{(1)}(D)$ with scalar product (175) and
without any boundary condition is also a complete Hilbert space, if
we identify functions whose difference is equivalent to a constant, i.e.
regard such functions as the same element of the space.


§ 3. Unbounded operators 


184. Closed operators. Let us turn to a consideration of distributive
operators, which may not be specified throughout the whole of $H$,
and which are not assumed to be bounded (to have finite norms).
The notation to be adopted is as follows. Let $A$ be a distributive
operator, $D(A)$ its domain of definition (which we shall always assume
to be a lineal), and $R(A)$ the range of values of $A$. It is also a lineal,
since $A$ is distributive. If $A$ establishes a one-to-one correspondence
between elements of $D(A)$ and $R(A)$, the inverse operator $A^{-1}$ is defined
on $R(A)$.

The necessary and sufficient condition for the existence of $A^{-1}$ is
that the equation $Ax = 0$ has only the zero solution (on $D(A)$) [127].
Operators $A$ and $B$ are said to coincide (be equal), which is written
as $A = B$, if their domains of definition coincide and if $Ax = Bx$ on
all elements of this domain. We say that operator $B$ is an extension of
operator $A$, and write $A \subseteq B$, if $D(A)$ belongs to $D(B)$ and $Ax = Bx$
for $x \in D(A)$. The symbol $A \subseteq B$ includes the possibility of $A = B$.
If, when $Ax = Bx$ for $x \in D(A)$, the lineal $D(B)$ is strictly greater than
$D(A)$, we write $A \subset B$. Notice also that $(A + B)x = Ax + Bx$ has a
meaning if $x \in D(A)$ and $x \in D(B)$, whilst $(AB)x = A(Bx)$, if $x \in D(B)$
and $Bx \in D(A)$. Since we are not assuming that an operator is defined
everywhere and bounded in norm, we cannot assert that it is
continuous. However, an analysis of the fundamental properties that we
have proved for bounded operators shows that many of the properties
are consequences, not of the continuity of the operators, but of a
weaker property, that of being "closed". We shall turn next to the
definition and analysis of this extremely important property of linear
operators.

DEFINITION. An operator $A$ is said to be closed if the following
condition is fulfilled: if $x_n \in D(A)$ $(n = 1, 2, \ldots)$ and the sequences $x_n$ and
$Ax_n$ have limits: $x_n \Rightarrow x_0$, $Ax_n \Rightarrow y_0$, then $x_0 \in D(A)$ and $Ax_0 = y_0$.

If an operator is not closed, the question arises as to whether it has
closed extensions. If there exist two sequences $x_n$ and $x'_n$ of $D(A)$, having
the same limit and such that $Ax_n$ and $Ax'_n$ have different limits, the
operator $A$ evidently does not permit of closed extensions. But if,
given the same limits for $x_n$ and $x'_n$, we never get different limits
for $Ax_n$ and $Ax'_n$, $A$ admits of closed extensions, among which there
is a minimal closed extension, which is usually denoted by $\bar{A}$. Let us
describe the formation of $\bar{A}$. If $x_n \in D(A)$, $x_n \Rightarrow x_0$, and $Ax_n \Rightarrow y_0$, we
include $x_0$ in the domain of definition of $\bar{A}$ and put $\bar{A}x_0 = y_0$. In
view of the above-mentioned condition, $\bar{A}$ is defined uniquely. By
using the triangle inequality, $\bar{A}$ is easily shown to be a closed
operator. This operation of extension of $A$ is called closure of $A$. If $B$
is any closed extension of $A$, it is easily seen that $\bar{A} \subseteq B$.

THEOREM 1. If $A$ is a closed operator, and $B$ is an operator bounded
on $D(A)$, $A + B$ is also a closed operator; $A^{-1}$, if it exists, is a closed
operator, and the set of solutions of the equation $Ax = 0$ is a subspace.

All these assertions are proved directly from the definition of closed
operator.

THEOREM 2. If $A$ permits of closure and has a bounded inverse $A^{-1}$
on $R(A)$, $\bar{A}$ has an inverse $\bar{A}^{-1}$ which is defined in the subspace $\overline{R(A)}$
and is bounded.

If $R(A)$ is a subspace, $A^{-1}$ is a closed operator. Otherwise, the
bounded operator $A^{-1}$ can be extended from the lineal $R(A)$ on to the
subspace $\overline{R(A)}$. Let $B$ denote the bounded operator thus obtained
($D(B) = \overline{R(A)}$). The equation $Bx = 0$ is easily seen to have only a
zero solution on $D(B)$. Otherwise, there would exist a sequence
$x_n \in D(A)$ such that $x_n \Rightarrow 0$, whilst $Ax_n \Rightarrow y \ne 0$. But this
contradicts the fact that $A$ admits of closure, since, if we take $x'_n = 0$
$(n = 1, 2, \ldots)$, then $Ax'_n = 0$. The operator $A_1 = B^{-1}$ is obviously the closure
of $A$. The theorem is proved.

Note. It will be seen that, in the present case, the closure of $A$ is
uniquely connected with the extension in continuity of the bounded
operator $A^{-1}$.

COROLLARY. If $A$ is a closed operator and the bounded inverse $A^{-1}$
exists on $R(A)$, $R(A)$ is a subspace.


185. Conjugate operators. We start from the following simple
remark: if the element $z$ is orthogonal to the lineal $l$, dense in $H$, $z$ is
the zero element.

For, let $(x, z) = 0$ if $x \in l$, and let $y$ be any element of $H$. Since $l$ is
dense in $H$, there exists a sequence of elements $x_n$ of $l$ such that
$x_n \Rightarrow y$. By the property of $z$, $(x_n, z) = 0$, and in the limit $(y, z) = 0$,
i.e. $z$ is orthogonal to any element of $H$, and in particular, to itself,
i.e. $(z, z) = \|z\|^2 = 0$, whence $z = 0$. If $l$ is not dense, there obviously
exists a non-zero element $z$ orthogonal to $l$.

We shall always assume below that the operators are distributive. 

Suppose that the operator $A$ is defined on the lineal $D(A)$, dense
in $H$. We form $(Ax, y)$, where $x \in D(A)$ and $y$ is any element of $H$.
There exist elements $y$ such that $(Ax, y)$ can be written for any $x$
of $D(A)$ as

$$(Ax, y) = (x, y^*) \qquad (x \in D(A)), \tag{1}$$

where $y^*$ is an element of $H$. For instance, if $y = 0$, then
$(Ax, 0) = (x, 0)$ for any $x$ of $D(A)$. If the form (1) is possible for a certain $y$,
the $y^*$ in this form is unique. For, if, for some $y$, we had
$(Ax, y) = (x, y_1^*)$ and $(Ax, y) = (x, y_2^*)$ for $x \in D(A)$, subtraction would give
$(x, y_1^* - y_2^*) = 0$, i.e. $y_1^* - y_2^*$ would be orthogonal to the lineal $D(A)$,
whence $y_1^* = y_2^*$. The set of elements $y$ for which $(Ax, y)$ can be written
in form (1) is obviously some lineal $l^*$, and a distributive operator,
transforming $y$ to $y^*$, is defined on this lineal. This operator is called
the conjugate to $A$ and is written as $A^*$; thus $y^* = A^*y$ and $l^*$ is $D(A^*)$,
and (1) can be rewritten as

$$(Ax, y) = (x, A^*y) \qquad (x \in D(A);\ y \in D(A^*)). \tag{2}$$

It follows from the above that the necessary and sufficient condition
for the existence of $A^*$ is that the lineal $D(A)$ be dense in $H$. As
mentioned above, $A^*$ is a distributive operator. We have the previous
definition of $A^*$ for a bounded operator $A$. Let us now mention some
properties of the conjugate operator.

THEOREM 1. The operator $A^*$ is closed.

Let $x_n \in D(A^*)$ and $x_n \Rightarrow x_0$, $A^*x_n \Rightarrow y_0$. By definition of $A^*$,
we have $(Ax, x_n) = (x, A^*x_n)$, where $x \in D(A)$, and, on passing to the
limit, we get $(Ax, x_0) = (x, y_0)$, whence, by definition of $A^*$,
$x_0 \in D(A^*)$ and $A^*x_0 = y_0$. This is what we wished to prove.

THEOREM 2. If $D(A)$ and $D(B)$ are dense in $H$ and $A \subset B$, then
$B^* \subseteq A^*$.

The lineal $D(B^*)$ is formed by the elements $y$ for which, given any
$x \in D(B)$, we have $(Bx, y) = (x, y^*)$, where $y^* = B^*y$. But, since
$A \subset B$, it follows from $(Bx, y) = (x, y^*)$ for $x \in D(B)$ that
$(Ax, y) = (x, y^*)$ for $x \in D(A)$, i.e. if $y \in D(B^*)$, then $y \in D(A^*)$ and
$B^*y = A^*y = y^*$, and this means in fact that $B^* \subseteq A^*$.

THEOREM 3. If $D(A)$ is dense in $H$ and $A$ admits of closure, then
$(\bar{A})^* = A^*$.

We have $A \subseteq \bar{A}$, so that $(\bar{A})^* \subseteq A^*$, and it remains to show that
every element $y$ of $D(A^*)$ belongs also to $D((\bar{A})^*)$. By hypothesis,
$(Ax, y) = (x, A^*y)$ for $x \in D(A)$, and it is enough to show that
$(\bar{A}x, y) = (x, A^*y)$ for $x \in D(\bar{A})$. If $x \in D(\bar{A})$, there exists a sequence
$x_n$ of $D(A)$ such that $x_n \Rightarrow x$ and $Ax_n \Rightarrow \bar{A}x$. By hypothesis,
$(Ax_n, y) = (x_n, A^*y)$ and in the limit $(\bar{A}x, y) = (x, A^*y)$. This is what
we set out to prove.

THEOREM 4. If $A^*$ and $(A^*)^* = A^{**}$ exist, then $A \subseteq A^{**}$.

The lineal $D(A^{**})$ of elements $z$ is defined by the equation
$(A^*y, z) = (y, z^{**})$ for $y \in D(A^*)$, where $z^{**} = A^{**}z$. But we have from the
definition of $A^*$: $(A^*y, z) = (y, Az)$, where $y \in D(A^*)$ and $z \in D(A)$,
whence it follows that $A \subseteq A^{**}$.

Since, by Theorem 1, $A^{**}$ is a closed operator, it follows from
$A \subseteq A^{**}$ that $A$ admits of closed extensions, i.e. the existence of $A^{**}$ is
a sufficient condition for $A$ to admit of closed extensions. We shall
see below that this condition is also necessary. Remember that the
existence of $A^{**}$ is equivalent to the fact that $D(A^*)$ is dense in $H$.
THEOREM 5. If $D(A)$ and $R(A)$ are dense in $H$ and the inverse $A^{-1}$
exists, there exist operators $A^*$, $(A^{-1})^*$, $(A^*)^{-1}$, and

$$(A^*)^{-1} = (A^{-1})^*. \tag{3}$$

The existence of $A^*$ and $(A^{-1})^*$ follows directly from the fact that
$D(A)$ and $R(A)$ are dense in $H$. Let $x \in D(A^*)$ and $y \in D(A^{-1}) = R(A)$.
We have

$$(x, y) = (x, AA^{-1}y) = (A^*x, A^{-1}y),$$

whence it follows that $A^*x \in D((A^{-1})^*)$ and

$$(A^{-1})^* A^*x = x \qquad (x \in D(A^*)). \tag{4}$$

This shows that the equation $A^*x = 0$ has only a zero solution (in
$D(A^*)$), i.e. the operator $(A^*)^{-1}$ exists, and in addition, it follows
from (4) that

$$(A^*)^{-1} \subseteq (A^{-1})^*. \tag{5}$$

Now let $x \in D(A)$ and $y \in D((A^{-1})^*)$. We have

$$(x, y) = (A^{-1}Ax, y) = (Ax, (A^{-1})^*y),$$

whence $(A^{-1})^*y \in D(A^*)$ and

$$A^*(A^{-1})^*y = y \qquad (y \in D((A^{-1})^*)).$$

But it follows from this equation that

$$(A^{-1})^* \subseteq (A^*)^{-1},$$

which, in conjunction with (5), yields (3). The theorem is proved.
The solubility of the equation

$$Ax = y \tag{6}$$

is bound up with the concept of conjugate operator.

A closed operator A with domain D(A), dense in H, is said to be 
normally soluble if the necessary and sufficient condition for (6) to be 
soluble (not necessarily uniquely) is that y be orthogonal to the sub- 
space of solutions of 

$$A^*z = 0. \tag{7}$$


THEOREM 6. The necessary and sufficient condition for the normal 
solubility of a closed operator A with a domain D(A), dense in H, is that 
R(A) be a subspace. 

The operator $A^*$ is closed, and the set of solutions of (7) is some 
subspace $l$. It may readily be seen that all the elements of $l$ are 
orthogonal to R(A). For, if $y \in R(A)$ and $z \in l$, then $y = Ax$ and 
$(y, z) = (Ax, z) = (x, A^*z) = (x, 0) = 0$. Thus, in view of the continuity of the scalar 
product, $l$ is orthogonal to the subspace $\overline{R(A)}$. Let us now show that, 
if an element w is orthogonal to R(A), it belongs to $l$. In fact, $(Ax, w) = 0$ 
implies $(Ax, w) = (x, 0) = (x, A^*w)$, i.e. $A^*w = 0$ and $w \in l$. It 
follows from what has been said that the whole of H is the direct sum 
of two orthogonal subspaces 

$$H = \overline{R(A)} \oplus l,$$

and the necessary and sufficient condition for A to be normally soluble 
is that $\overline{R(A)}$ coincide with R(A), i.e. that R(A) be a subspace. 
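
As an illustration of Theorem 6, here is a short sketch of an operator that fails to be normally soluble. The example is not taken from the text: the space $l_2$ of square-summable sequences and the diagonal operator below are assumptions chosen only to make the condition concrete.

```latex
% Assumed example: H = l_2 and A the bounded diagonal operator (Ax)_n = x_n / n.
% A is defined on the whole of H, hence closed, and A^* = A.
\begin{align*}
  & A^*z = 0 \;\Longrightarrow\; z_n / n = 0 \ \text{for all } n
      \;\Longrightarrow\; z = 0, \qquad\text{so } l = \{0\}. \\
  & \text{Every } y \in H \text{ is orthogonal to } l,
      \text{ so } H = \overline{R(A)} \oplus l \text{ gives } \overline{R(A)} = H. \\
  & \text{Yet for } y = (1, \tfrac12, \tfrac13, \dots) \in l_2: \quad
      Ax = y \;\Longrightarrow\; x = (1, 1, 1, \dots) \notin l_2 .
\end{align*}
% Thus R(A) is dense in H but not closed, and A is not normally soluble.
```

In a finite-dimensional space every lineal is closed, so every operator there is normally soluble; the failure above is a genuinely infinite-dimensional effect.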


186. The graph of an operator. We can discuss, in addition to space 
H, the space $\mathcal{H}$ whose elements are pairs $\{x, y\}$ of elements x and y 
of H, multiplication by a number and addition being defined in $\mathcal{H}$ by 

$$a\{x, y\} = \{ax, ay\}; \quad \{x_1, y_1\} + \{x_2, y_2\} = \{x_1 + x_2, y_1 + y_2\}, \tag{8}$$

and the scalar product by 

$$(\{x_1, y_1\}, \{x_2, y_2\}) = (x_1, x_2) + (y_1, y_2). \tag{9}$$

All the axioms are easily seen to hold. If A is an operator in H, the 
set F(A) of elements $\{x, Ax\}$ of space $\mathcal{H}$ for $x \in D(A)$ is called the 
graph of operator A. All the elements of this set are uniquely defined 
by their abscissae (by the first element of a pair). Conversely, if all the 
elements of some set F of elements of $\mathcal{H}$ are uniquely defined by their 
abscissae, there exists in H an operator (not necessarily distributive) 
whose graph is the set F. The fact that A is closed is easily seen to be 
equivalent to the set F(A) being closed in $\mathcal{H}$. If A is distributive, and 
defined on a lineal, F(A) is a lineal in $\mathcal{H}$. As above, we shall in future 
only discuss distributive operators, defined on lineals. 
Let us define an operator U in the whole of the space $\mathcal{H}$ of pairs by the equation 

$$U\{x, y\} = \{iy, -ix\}. \tag{10}$$

It is easily seen that U is a unitary operator and that $U^{-1} = U$. 
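
The two properties of U claimed above can be checked directly from (9) and (10). The following lines are only a verification sketch, written for an arbitrary pair $\{x, y\}$ of the space of pairs:

```latex
\begin{align*}
  \| U\{x, y\} \|^2 &= \| \{iy, -ix\} \|^2 = \|iy\|^2 + \|{-ix}\|^2
                     = \|x\|^2 + \|y\|^2 = \| \{x, y\} \|^2, \\
  U^2\{x, y\} &= U\{iy, -ix\} = \{\, i(-ix),\ -i(iy) \,\} = \{x, y\}.
\end{align*}
% U preserves the norm and U^2 is the identity, so U maps the space of pairs
% onto itself, is unitary, and satisfies U^{-1} = U.
```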


Let A be an operator in H. We form the scalar product of an element 
of the set U F(A) with an element $\{x, y\}$: 

$$(\{iAz, -iz\}, \{x, y\}) = i[(Az, x) - (z, y)] \quad (z \in D(A)). \tag{11}$$


Let A be an operator closed in H and D(A) be dense in H. Let us 
prove that the space $\mathcal{H}$ of pairs can be expanded into two orthogonal subspaces in 
accordance with 

$$\mathcal{H} = U F(A) \oplus F(A^*). \tag{12}$$

If the element $\{x, y\}$ is orthogonal to U F(A), it follows from (11) 
that $(Az, x) = (z, y)$ for $z \in D(A)$, i.e. $x \in D(A^*)$ and $y = A^*x$, or, 
alternatively, $\{x, y\} \in F(A^*)$. Conversely, it also follows from (11) 
that, if $\{x, y\} \in F(A^*)$, the element $\{x, y\}$ is orthogonal to U F(A). 
It only remains to observe, in order to prove (12), that, since A and $A^*$ 
are closed, U F(A) and $F(A^*)$ are subspaces of space $\mathcal{H}$. 

If A is not closed, but D(A) is dense in H, as above, we have instead 
of (12): 

$$\mathcal{H} = \overline{U F(A)} \oplus F(A^*). \tag{13}$$

Further, the difference $\mathcal{H} \ominus \overline{U F(A)}$ is the set of elements $\{x, y\}$ 
orthogonal to $\overline{U F(A)}$, or what amounts to the same thing, to U F(A), 
i.e. by (11), this set consists of the pairs $\{x, y\}$ satisfying the condition 
$(Az, x) = (z, y)$ for $z \in D(A)$, so that the existence of the operator $A^*$ 
is equivalent to the fact that the elements of this set are uniquely 
defined by the abscissa x. 

In view of the above, we have the following lemma. 

LEMMA. The necessary and sufficient condition for the existence of 
operator $A^*$ is that the elements of the set 

$$\mathcal{H} \ominus \overline{U F(A)}$$

be uniquely defined by their abscissae. 

Let us now prove a theorem. 


THEOREM 1. If an operator A is defined on a dense set and admits of 
closure, there exist $A^*$, $A^{**}$ and 

$$A^{**} = \bar A. \tag{14}$$

Suppose first that A is closed. It follows from (12) and $U^{-1} = U$ 
that 

$$\mathcal{H} = F(A) \oplus U F(A^*),$$

i.e. the elements of the set $F(A) = \mathcal{H} \ominus U F(A^*)$ are uniquely defined 
by their abscissae, and, by the lemma, this set defines the graph of 
the operator conjugate to $A^*$, i.e. of $A^{**}$. But this set is F(A), whence 
$A^{**} = A$. 

Suppose that A is not closed, but admits of closure. By what we 
have proved, $(\bar A)^{**} = \bar A$. But, on the other hand [185]: 
$(\bar A)^{**} = ((\bar A)^*)^* = (A^*)^* = A^{**}$, whence (14) follows. 

COROLLARY. If $A^*$ and $A^{**}$ exist, A admits of closure [185] and it 
follows from (14) that 

$$(A^{**})^* = A^{***} = A^*. \tag{15}$$

THEOREM 2. If A is a closed operator and D(A) = H, A is bounded. 

It follows from the hypotheses and Theorem 1 that $D(A^*)$ is dense in 
H and $A = A^{**}$. Let us first show that there exists a positive number 
N such that $\|A^*x\| \le N \|x\|$ $(x \in D(A^*))$. 

To do this, we consider the scalar product 

$$(y, A^*x) = (Ay, x).$$

It defines in H a linear (bounded) functional $l_x(y)$ for fixed x of 
$D(A^*)$. If $x_n \in D(A^*)$ and $x_n \Rightarrow 0$, the sequence of functionals $l_{x_n}(y)$ 
tends to zero on any element y of H, so that a positive number $N_1$ 
exists, such that [100] 

$$|l_{x_n}(y)| = |(y, A^*x_n)| \le N_1 \|y\|. \tag{16}$$

If the operator $A^*$ did not have a bounded norm, there would exist 
a sequence $x_n \in D(A^*)$, for which $x_n \Rightarrow 0$, whilst $\|A^*x_n\| \to \infty$. 
But this contradicts (16), since, on putting $y = A^*x_n$ in (16), we get 
$\|A^*x_n\|^2 \le N_1 \|A^*x_n\|$, i.e. $\|A^*x_n\| \le N_1$. Thus $\|A^*\| < \infty$ on 
$D(A^*)$. But $A^*$ can now be extended to the whole of H, and, since $A^*$ 
is a closed operator, $D(A^*) = H$. The operator $A^{**} = A$ is conjugate 
to the bounded operator $A^*$, i.e. is itself bounded, and the theorem is 
proved. 

Notice also that, if $\lambda$ is a number and $A^*$ exists, 
$(A - \lambda E)^* = A^* - \bar\lambda E$ also exists. If B is a bounded linear operator, given on 
the whole of H, whilst A has $A^*$, then $(A + B)^*$ exists and is equal 
to $A^* + B^*$. 


187. Symmetric and self-conjugate operators. We shall be mainly 
concerned below with so-called symmetric and self-conjugate operators. 

DEFINITION. An operator A is said to be symmetric if D(A) is dense 
in H and 

$$(Ax, y) = (x, Ay) \tag{17}$$

for any x and y of D(A). 

It follows from (17) that any y of D(A) belongs to $D(A^*)$ also, and 
$A^*y = Ay$ for such y, i.e. 

$$A \subset A^*. \tag{18}$$

A symmetric operator is said to be semi-bounded from below if 
there exists a finite 

$$m_A = \inf (Ax, x) \quad \text{for } x \in D(A) \text{ and } \|x\| = 1,$$

whence it follows that 

$$(Ax, x) \ge m_A (x, x) \quad (x \in D(A)), \tag{19}$$

and the number $m_A$ cannot be replaced by a greater one. 

If $m_A > 0$, the operator A is said to be positive definite, whilst it is 
called positive if $m_A \ge 0$ [cf. 126]. 

Notice also that, if the lineal D(A) is dense in H and $(Ax, x)$ is real 
for all $x \in D(A)$, then A is symmetric, i.e. $(Ax, y) = (x, Ay)$ for x and 
$y \in D(A)$. This is proved just like Theorem 2 of [124]. 

DEFINITION. A symmetric operator A is said to be self-conjugate if 
$A^* = A$. 

It follows from what has been said that, to prove that a symmetric 
operator is self-conjugate, we only need to prove the following: if an 
element $x \in D(A^*)$, then $x \in D(A)$. By (18), a symmetric operator A 
admits of closure, and we have 

$$\bar A = A^{**} \subset A^*; \quad A^{***} = (\bar A)^* = A^*. \tag{20}$$


A self-conjugate operator is obviously closed. 

Notice also that, if $\lambda$ is a real number, the operator $A - \lambda E$ will be 
symmetric if A is symmetric, and self-conjugate if A is self-conjugate. 

THEOREM 1. If a self-conjugate operator A has an inverse $A^{-1}$, the 
range of values R(A) of A is dense in H and $A^{-1}$ is a self-conjugate 
operator on R(A). 

If the lineal R(A) were not dense in H, there would exist an element 
z, different from zero, orthogonal to R(A), i.e. 

$$(Ax, z) = 0 \quad (x \in D(A)),$$

or, what amounts to the same thing, $(Ax, z) = (x, 0)$. But this implies 
that $z \in D(A^*)$ and $A^*z = Az = 0$ for a non-zero element, which 
contradicts the existence of the inverse $A^{-1}$. We have therefore shown 
that the lineal R(A) is dense in H. 

It follows from Theorem 5 [185] that $(A^{-1})^* = (A^*)^{-1}$, or, since A is 
self-conjugate: $(A^{-1})^* = A^{-1}$, i.e. $A^{-1}$ is in fact self-conjugate. 

THEOREM 2. If there exists for a symmetric operator A a number $\lambda$ 
such that both elements of the form $(A - \lambda E)x$ and elements of the form 
$(A - \bar\lambda E)x$ $(x \in D(A))$ fill the whole of H, then A is self-conjugate on D(A). 

We have to show that, if $y \in D(A^*)$, then $y \in D(A)$. We have 
$(Ax, y) = (x, y^*)$ for $x \in D(A)$, whence 

$$((A - \lambda E)x, y) = (x, y^* - \bar\lambda y).$$

By hypothesis, there exists at least one element $z \in D(A)$ such that 
$y^* - \bar\lambda y = (A - \bar\lambda E)z$, and we can write, in view of the symmetry 
of A: 

$$((A - \lambda E)x, y) = (x, (A - \bar\lambda E)z) = ((A - \lambda E)x, z).$$

But elements of the form $(A - \lambda E)x$ exhaust the whole of H, and 
it follows from the last equation that $y = z \in D(A)$. This is what we 
needed to prove. 

COROLLARY. If R(A) = H for a symmetric operator A, then A is self-
conjugate. 

The proof only requires the application of Theorem 2 with $\lambda = 0$. 

We shall prove in [189] that a proposition holds which is in a 
certain sense the converse of Theorem 2. 

In fact, if A is a self-conjugate operator and $\lambda$ is a non-real number, 
the operator $A - \lambda E$ has a bounded inverse, defined in the whole of H. 

THEOREM 3. If a symmetric operator A has an inverse $A^{-1}$, bounded 
on R(A), then $R(A^*) = H$. 

In view of the fact that the closure of a symmetric operator leads 
to a symmetric operator, and that $(\bar A)^* = A^*$, we can assume for the 
proof that A is closed. On using Theorem 2 of [184], we can say that 
R(A) is a subspace. We have to show that, given any fixed $y^* \in H$ and 
any $x \in D(A)$, the scalar product $(x, y^*)$ can be written as $(Ax, y)$. 
On writing $Ax = z$, we get $(x, y^*) = (A^{-1}z, y^*)$, and, since $A^{-1}$ is 
bounded on R(A), the expression $(A^{-1}z, y^*)$ can be regarded as a linear 
(bounded) functional $l_{y^*}(z)$ on the subspace R(A). It can be written as 
$(z, y)$ [123], where $y \in R(A)$, so that we get $(x, y^*) = (z, y) = (Ax, y)$. 

This equation implies that any $y^*$ of H is expressible in the form 
$A^*y$. This is what we set out to prove. 

COROLLARY. If A is a self-conjugate, positive definite operator, $A^{-1}$ 
exists, and is bounded and defined on the whole of H. 

Since A is positive definite, i.e. 

$$(Ax, x) \ge a(x, x) \quad (a > 0),$$

we have $a\|x\| \le \|Ax\|$, whence it follows that $A^{-1}$ exists on R(A), 
and its norm $\|A^{-1}\|$ does not exceed $a^{-1}$. We can assert, on the basis 
of the previous theorem, that $R(A^*) = H$, but $R(A^*) = R(A)$, so that 
$A^{-1}$ is defined on the whole of H. 
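
The step from positive definiteness to the bound on the inverse is a one-line application of Buniakowski's inequality. It can be sketched as follows; a denotes the constant in the positive definiteness condition and x is any element of D(A):

```latex
\begin{align*}
  a\,\|x\|^2 \;\le\; (Ax, x) \;\le\; \|Ax\|\,\|x\|
    \quad&\Longrightarrow\quad a\,\|x\| \le \|Ax\|, \\
  Ax = 0 \quad&\Longrightarrow\quad x = 0
    \qquad (\text{so } A^{-1} \text{ exists on } R(A)), \\
  y = Ax \quad&\Longrightarrow\quad \|A^{-1}y\| = \|x\| \le a^{-1}\|y\|.
\end{align*}
```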

Notice, finally, that, given any symmetric extension $\tilde A$ of a 
symmetric operator A, we have $\tilde A \subset A^*$; also, a self-conjugate operator 
does not admit of symmetric extensions. 
Both these propositions follow directly from the definitions of 
symmetric and self-conjugate operators. 

THEOREM 4. If A is a closed linear operator with a domain of definition 
dense in H, the product $A^*A$ is a self-conjugate positive operator. 

The fact that $A^*A$ is positive follows from 

$$(A^*Ax, x) = (Ax, Ax) \ge 0 \quad (x \in D(A^*A)).$$

The symmetry of $A^*A$ on $D(A^*A)$ is clear from the equations 

$$(A^*Ax, y) = (Ax, Ay) = (x, A^*Ay).$$

Let us show that the equation 

$$(A^*A + E)x = y \tag{21}$$

is (uniquely) soluble for any y of H. We take the space $\mathcal{H}$ of pairs introduced 
in [186], and its decomposition into 

$$\mathcal{H} = F(A) \oplus U F(A^*).$$

It follows from this that the element $\{y, 0\}$ is uniquely expressible as 

$$\{y, 0\} = \{x, Ax\} + i\{A^*z, -z\}.$$

Consequently, $y = x + iA^*z$, $z = -iAx$, so that 

$$y = x + A^*Ax \quad (x \in D(A^*A)),$$

i.e. equation (21) has a solution for any $y \in H$. Let us now show that 
$D(A^*A)$ is dense in H. If we assume the contrary, a non-zero element z 
must exist, orthogonal to $D(A^*A)$. By what has been said, it can be 
written as $z = (A^*A + E)x_0$, where $x_0 \in D(A^*A)$, and for any 
$x \in D(A^*A)$: 

$$0 = (z, x) = ((A^*A + E)x_0, x) = (x_0, (A^*A + E)x),$$

and we obtain, on putting $x = x_0$, 

$$\|x_0\|^2 + (x_0, A^*Ax_0) = \|x_0\|^2 + \|Ax_0\|^2 = 0,$$

i.e. $x_0 = 0$, which contradicts the foregoing. Hence $D(A^*A)$ is dense in 
H, i.e. $A^*A$ and $E + A^*A$ are symmetric operators. In view of the fact 
that $R(A^*A + E) = H$, the operator $A^*A + E$ is self-conjugate 
(Theorem 2). This means that $A^*A$ is also self-conjugate. 


188. Examples of unbounded operators. The present section will be 
concerned with various differential operators, from the point of view of the general 
theory of operators. All the operators will be unbounded. The discussion will 
take place in the complex Hilbert space $L_2$ of complex-valued functions of a 
real variable. The formula for integration by parts, which will play a 
fundamental role in what follows, has the same form here as in real space, viz. we have 
for any two functions $\varphi(x)$ and $\psi(x)$ of $W_2^{(1)}(D)$ in the case of a piecewise smooth 
surface S of a domain D [113]: 

$$\int_D \frac{\partial \varphi(x)}{\partial x_k}\, \psi(x)\, dx = -\int_D \varphi(x)\, \frac{\partial \psi(x)}{\partial x_k}\, dx + \int_S \varphi(x)\, \psi(x) \cos(n, x_k)\, dS, \tag{22}$$

where n is the outward normal to S. 

We shall start with the simple differential operator $D = i\, d/dx$. 

1. The operator $D = i\, d/dx$ in space $H = L_2[0, 1]$. 

We have seen that, in the abstract theory, an operator A is specified by 
the domain of definition D(A) and by the rule for evaluating A on elements of 
D(A). 

Our present operator D can be uniquely defined on all functions of $L_2[0, 1]$ 
having a generalized derivative in $L_2[0, 1]$. But the D defined on so wide a class 
of functions will not have a number of properties possessed by D when it is 
considered say on finite smooth functions. We shall therefore start, in this 
and the subsequent examples, from an initial discussion of the differential 
operator on a set of smooth functions subject to certain boundary conditions; 
we study its properties, such as symmetry, positive definiteness, inversion 
etc., then raise the question of extending it whilst retaining certain properties. 

The choice of the domain of definition of the original differential operator 
is not unique. In order to emphasize this, we shall make different choices in 
the examples below. 

Let A denote the operator D, considered on the set $\mathring{C}^{(1)}[0, 1]$ of all finite 
continuously differentiable functions on [0, 1] (cf. the notation in [113]). 
The value of A on $\varphi$ of D(A) is calculated from the formula 

$$A\varphi = i\, \frac{d\varphi(x)}{dx}, \tag{23}$$

and D(A) is dense in $L_2[0, 1]$. 

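The operator A is symmetric, since it follows from (23) that, for $\varphi(x)$ and 
$\psi(x) \in D(A)$: 

$$(A\varphi, \psi) = \int_0^1 i\, \frac{d\varphi(x)}{dx}\, \overline{\psi(x)}\, dx = -\int_0^1 i\, \varphi(x)\, \frac{d\overline{\psi(x)}}{dx}\, dx = (\varphi, A\psi).$$

The symmetry of A implies that A admits of closure, has a conjugate and 
$A \subset \bar A \subset A^*$. Let us find the functions that make up $D(\bar A)$ and $D(A^*)$. Let 
$\varphi_m(x) \in D(A)$ and $\varphi_m(x) \Rightarrow \varphi(x)$, $A\varphi_m = \psi_m(x) \Rightarrow \psi(x)$; then $\varphi(x) \in D(\bar A)$ 
and $\psi(x) = \bar A\varphi(x)$. 

It follows from the theory of generalized derivatives that this closure of 
A extends D(A) to $D(\bar A) = \mathring{W}_2^{(1)}[0, 1]$ [113]. 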

Each $\varphi(x)$ of $\mathring{W}_2^{(1)}[0, 1]$ is an absolutely continuous function, equal to zero 
at the ends and having a generalized first derivative of $L_2[0, 1]$. It could be 
shown that any such function belongs to $\mathring{W}_2^{(1)}[0, 1]$. The operator $\bar A$ on $\varphi(x)$ 
of $D(\bar A)$ is calculated from (23), the only difference being that d/dx now denotes 
generalized and not classical differentiation. Let us now find the functions 
that make up $D(A^*)$. The function $\varphi(x) \in D(A^*)$, if there is a function 
$\psi(x) \in L_2[0, 1]$ such that $(A\omega, \varphi) = (\omega, \psi)$, i.e. 

$$\int_0^1 i\, \frac{d\omega(x)}{dx}\, \overline{\varphi(x)}\, dx = \int_0^1 \omega(x)\, \overline{\psi(x)}\, dx \tag{24}$$

for all $\omega(x)$ of D(A). But this implies [109] that $\varphi(x)$ has a generalized derivative 
$d\varphi(x)/dx$, equal to $-i\psi(x)$, i.e. that $\varphi(x) \in W_2^{(1)}[0, 1]$ and $i\, d\varphi(x)/dx = \psi(x)$, 
where any $\varphi(x)$ of $W_2^{(1)}[0, 1]$ satisfies (24) with $\psi(x) = i\, d\varphi(x)/dx$. We have thus 
shown that $D(A^*) = W_2^{(1)}[0, 1]$ and $A^*\varphi = i\, d\varphi(x)/dx$. It is clear that $W_2^{(1)}[0, 1]$ 
is wider than $\mathring{W}_2^{(1)}[0, 1]$. It is easily verified that $A^*$ is not symmetric on $D(A^*)$. 

Let us show that a bounded inverse exists on R(A) (whence it will follow 
that $R(\bar A)$ is a subspace). Let $\varphi(x) \in D(A)$. Then $\varphi(x) = (1/i) \int_0^x A\varphi(x)\, dx$, 
and, by Buniakowski's inequality, 

$$\|\varphi\|^2 = \int_0^1 \Big| \int_0^x A\varphi(x)\, dx \Big|^2 dx \le \|A\varphi\|^2.$$

It follows from this that $A^{-1}$ exists on R(A) and $\|A^{-1}\| \le 1$. 
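
The application of Buniakowski's inequality used here can be written out in two steps. This is only a sketch of the estimate; t is an auxiliary integration variable introduced to keep the limits apart, and the norm is that of the space of square-summable functions on [0, 1]:

```latex
\begin{align*}
  \Big| \int_0^x A\varphi(t)\, dt \Big|^2
    &\le \int_0^x 1^2\, dt \cdot \int_0^x |A\varphi(t)|^2\, dt
     \;\le\; x\, \|A\varphi\|^2 \qquad (0 \le x \le 1), \\
  \|\varphi\|^2 = \int_0^1 \Big| \int_0^x A\varphi(t)\, dt \Big|^2 dx
    &\le \|A\varphi\|^2 \int_0^1 x\, dx = \tfrac12\, \|A\varphi\|^2 \;\le\; \|A\varphi\|^2 .
\end{align*}
% The estimate in fact gives the slightly stronger bound
% \|A^{-1}\| \le 1/\sqrt{2}, which is consistent with \|A^{-1}\| \le 1.
```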

We must discuss the possibility of different self-conjugate extensions of the 
operator A. We know that, given any symmetric extension $\tilde A$, we have 
$A \subset \tilde A \subset A^*$, i.e. given this extension, we must add to D(A) elements of 
$D(A^*)$, and take $\tilde A x = A^*x$ on these added elements x. It should be borne in 
mind that the symmetry of the operator is not lost during the extension. 

Let us write H as 

$$H = R(\bar A) \oplus U.$$

By the theorem of [185], U consists of the zeros u(x) of the conjugate operator. 
But $A^*u = i\, du(x)/dx$, so that u(x) = const. We try to extend $\bar A$ in such a way 
that $R(\bar A)$ is extended to H and the operator remains symmetric. This is done 
by taking all the solutions of the equations $A^*\varphi = \text{const}$ and choosing those 
among them on which the operator D is symmetric. 

Obviously, $\varphi(x)$ has the form $\varphi(x) = C_1(x + C)$. We choose constants C and 
$C_1$ so that D is symmetric on $\varphi(x)$, i.e. so that 

$$0 = (D\varphi, \varphi) - (\varphi, D\varphi) = i\, \varphi(x)\, \overline{\varphi(x)}\, \Big|_{x=0}^{x=1} = i\, |C_1|^2 \big[ (1 + C)(1 + \bar C) - C \bar C \big] = i\, |C_1|^2 (1 + C + \bar C).$$

Hence we have arbitrary $C_1$, and $C = -1/2 + i\beta$, where $\beta$ is an arbitrary 
real number. We associate with $D(\bar A)$ the elements $\varphi(x) = C_1(x - 1/2 + i\beta)$, 
where $\beta$ is a fixed real number, and $C_1$ is an arbitrary complex number. Let 
$D(\tilde A)$ denote the set obtained, and $\tilde A$ the operator D on it. It is easily seen that 
$\tilde A$ is a symmetric extension of $\bar A$. 

On the other hand, the domain of values of $\tilde A$ is, by virtue of its 
construction, the whole of H, so that $\tilde A$ is a self-conjugate extension of A [187]. The 
added elements of this extension are easily seen to satisfy the boundary 
condition 

$$\varphi(1) = \frac{\tfrac12 + i\beta}{-\tfrac12 + i\beta}\, \varphi(0), \quad \text{i.e.} \quad \varphi(1) = e^{i\theta} \varphi(0) \quad (0 < \theta < 2\pi). \tag{25}$$
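
That the factor in boundary condition (25) has modulus one, and therefore can be written as a pure phase, may be checked as follows. This is a sketch that reads the added elements as constant multiples of $x - 1/2 + i\beta$ with real $\beta$, as in the construction above:

```latex
\begin{align*}
  \Big| \tfrac12 + i\beta \Big|^2 &= \tfrac14 + \beta^2 = \Big| -\tfrac12 + i\beta \Big|^2, \\
  \frac{\tfrac12 + i\beta}{-\tfrac12 + i\beta}
    &= \frac{(\tfrac12 + i\beta)(-\tfrac12 - i\beta)}{\tfrac14 + \beta^2}
     = \frac{\beta^2 - \tfrac14 - i\beta}{\beta^2 + \tfrac14} .
\end{align*}
% For \beta = 0 the factor equals -1 (\theta = \pi); it tends to 1 only as
% \beta \to \pm\infty, which is why \theta = 0 is not attained for finite \beta.
```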
Obviously, the elements of $D(\bar A)$ also satisfy this condition. On the other 
hand, we have for any element $\varphi(x)$ of $D(A^*)$, satisfying condition (25): 

$$\int_0^1 i\, \frac{d\omega(x)}{dx}\, \overline{\varphi(x)}\, dx = \int_0^1 \omega(x)\, \overline{i\, \frac{d\varphi(x)}{dx}}\, dx$$

for arbitrary $\omega(x)$ of $D(\tilde A)$. Therefore such a $\varphi(x) \in D(\tilde A^*)$ and $\tilde A^*\varphi = i\, d\varphi(x)/dx$. 
But $\tilde A$ is a self-conjugate extension of A, i.e. $\tilde A^* = \tilde A$, and $D(\tilde A)$ therefore 
consists of all the elements of $D(A^*)$ that satisfy (25). It follows from what 
has been said that $\mathring{W}_2^{(1)}[0, 1] = D(\bar A)$ consists of all absolutely continuous 
functions $\varphi(x)$ that vanish at the ends of the interval and have $d\varphi(x)/dx$ from $L_2[0, 1]$. 

We have formed all the possible self-conjugate extensions $\tilde A$ of A for which 
$R(\tilde A) = H$. Each of these extensions is defined by an arbitrary real 
parameter $\beta$, or what amounts to the same thing, by a real number $\theta$ varying 
between the limits $0 < \theta < 2\pi$. 

Let us also consider the possibility of self-conjugate extensions A' of A such 
that $R(A') = R(\bar A)$. 

If such an extension exists, $D(\bar A)$ is supplemented for it merely by zeros of the 
operator $A^*$, i.e. by elements $\varphi(x) = \text{const}$. The operator D on the set 
$D(\bar A) + \text{const}$ is a symmetric operator A', which is an extension of $\bar A$. The set D(A') 
can be characterized by the fact that it consists of all the elements 
$\varphi(x) \in D(A^*)$ for which $\varphi(0) = \varphi(1)$. 

Let us show that $D(A'^*) = D(A')$. Let $\varphi(x) \in D(A'^*)$, i.e. given any 
$\omega(x) \in D(A')$, let 

$$0 = (A'\omega, \varphi) - (\omega, \psi). \tag{26}$$

But $D(A'^*) \subset D(A^*)$, so that 

$$\int_0^1 i\, \frac{d\omega(x)}{dx}\, \overline{\varphi(x)}\, dx = \int_0^1 \omega(x)\, \overline{i\, \frac{d\varphi(x)}{dx}}\, dx + i\, \omega(x)\, \overline{\varphi(x)}\, \Big|_{x=0}^{x=1},$$

whence, in view of (26) and the fact that $\omega(x) \in D(A')$, we have $\psi(x) = i\, d\varphi(x)/dx$ 
and $\varphi(0) = \varphi(1)$, i.e. $\varphi(x) \in D(A')$. We have thus shown that A' is a 
self-conjugate extension of A. If we compare the boundary condition $\varphi(0) = \varphi(1)$, 
to which the functions of D(A') are subject, with condition (25) for our earlier 
extensions, it will be seen that the former corresponds to $\theta = 0$ (or what amounts 
to the same thing, $\beta = \infty$). 

We have thus exhausted all possible self-conjugate extensions of the 
operator A. In addition to these, A has various non-self-conjugate extensions, but we 
shall not be concerned with them. 

The self-conjugate extensions $\tilde A$, corresponding to $\theta \ne 0$, have bounded 
inverses $\tilde A^{-1}$. For, if $\tilde A\varphi = i\, d\varphi(x)/dx = 0$, then $\varphi(x) = C$ and, by (25), we must 
have $C = e^{i\theta} C$, i.e. C = 0. Hence it follows that $\tilde A^{-1}$ exists, and, since it is 
defined on the whole of H and is a self-conjugate operator, it is in fact bounded 
[186]. 

The operator A' has no inverse on $R(A') = R(\bar A)$. 

2. The operator $D = i\, d/dx$ in space $H = L_2(-\infty, +\infty)$. 

Let A denote the differential operator D, defined on continuously 
differentiable finite functions $\varphi(x)$. We can easily see that it is symmetric and that D(A) 
is dense in H. 

Let us consider the conjugate operator $A^*$. The function $\psi(x) \in D(A^*)$, if, 
given any $\varphi(x) \in D(A)$, we have 

$$\int_{-\infty}^{+\infty} i\, \frac{d\varphi(x)}{dx}\, \overline{\psi(x)}\, dx = \int_{-\infty}^{+\infty} \varphi(x)\, \overline{\psi^*(x)}\, dx, \tag{27}$$

where $\psi^*(x) \in L_2(-\infty, +\infty)$, and $\psi^*(x) = A^*\psi(x)$. But it follows at once from 
the first definition of generalized derivative that $D(A^*)$ is the set $W_2^{(1)}(-\infty, +\infty)$, 
i.e. the set of functions of $L_2(-\infty, +\infty)$, absolutely continuous on every finite 
interval, having a derivative in $L_2(-\infty, +\infty)$, and $\psi^*(x) = D\psi(x)$. Let us show 
that, if $\psi(x) \in W_2^{(1)}(-\infty, +\infty)$, then $\psi(x) \to 0$ as $x \to \pm\infty$. It follows from the 
obvious equation 

$$|\psi(x)|^2 = |\psi(a)|^2 + \int_a^x \frac{d\psi(x)}{dx}\, \overline{\psi(x)}\, dx + \int_a^x \psi(x)\, \frac{d\overline{\psi(x)}}{dx}\, dx$$

and the fact that $\psi(x)$ and $d\psi(x)/dx \in L_2(-\infty, +\infty)$, that $|\psi(x)|$ has a finite 
limit as $x \to \pm\infty$ and that this limit must be zero. 
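
The two claims in the last sentence (existence of the limit, and its vanishing) can be justified as follows. This is a sketch of the omitted argument for $x \to +\infty$; the case $x \to -\infty$ is the same, and t is an auxiliary integration variable:

```latex
\begin{align*}
  \int_a^{\infty} \Big| \frac{d\psi(t)}{dt}\, \overline{\psi(t)} \Big|\, dt
    &\le \Big\| \frac{d\psi}{dx} \Big\|_{L_2} \, \|\psi\|_{L_2} < \infty
    \qquad \text{(Buniakowski's inequality)}, \\
  \text{hence}\quad \lim_{x \to +\infty} |\psi(x)|^2 &= c \ge 0 \ \text{exists.}
\end{align*}
% If c > 0, then |\psi(x)|^2 \ge c/2 for all sufficiently large x, so that
% \int |\psi(x)|^2 dx would diverge, contradicting \psi \in L_2. Hence c = 0.
```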

Let us now investigate $A^{**}$. The function $\psi(x) \in D(A^{**})$, if (27) is satisfied 
for every $\varphi(x) \in D(A^*)$, where $\psi^*(x) \in L_2(-\infty, +\infty)$ and $\psi^*(x) = A^{**}\psi(x)$. 

On recalling that $A \subset A^*$ and $A^{**} \subset A^*$, we can assert that every function 
$\psi(x)$ of $D(A^{**})$ must belong to $D(A^*)$, i.e. to $W_2^{(1)}(-\infty, +\infty)$, and 
$A^{**}\psi(x) = D\psi(x)$. On the other hand, (27) is easily proved, by assuming that 
$\varphi(x) \in D(A^*)$, $\psi(x) \in W_2^{(1)}(-\infty, +\infty)$ and $\psi^*(x) = i\, d\psi(x)/dx$. For, integration by 
parts over any finite interval gives 

$$\int_a^b i\, \frac{d\varphi(x)}{dx}\, \overline{\psi(x)}\, dx = \int_a^b \varphi(x)\, \overline{i\, \frac{d\psi(x)}{dx}}\, dx + i\, \varphi(x)\, \overline{\psi(x)}\, \Big|_a^b.$$

On observing that $\varphi(x)$ and $\psi(x) \to 0$ as $x \to \pm\infty$, we arrive at (27) in the 
limit as $a \to -\infty$ and $b \to +\infty$. It follows from what has been said that 
$A^{**} = A^*$, i.e. $A^*$ is a self-conjugate operator. But we know that $A^{**} = \bar A$, 
so that the closure of A leads to the self-conjugate operator $A^*$. 

3. The operator $D = i\, d/dx$ in space $H = L_2(0, +\infty)$. 

Let A be the operator D on all continuously differentiable functions, finite 
at infinity and in the neighbourhood of x = 0. It can be shown with the aid 
of arguments precisely similar to the above that $A^*$ is the operator D, $D(A^*)$ 
is the set of functions $\varphi(x)$ of H, absolutely continuous in any finite interval 
[0, a] with a derivative in $L_2(0, +\infty)$ and $D(\bar A) = D(A^{**})$ is the set of 
functions $\varphi(x)$ of $D(A^*)$ that satisfy $\varphi(0) = 0$, where $\bar A\varphi(x) = i\, d\varphi(x)/dx$. Hence 
$D(A^*)$ is wider than $D(\bar A)$, and $\bar A$ is not a self-conjugate operator, since 
$(\bar A)^* = A^*$. Let us show that $\bar A$ has no self-conjugate extensions. Let $\tilde A$ be such 
an extension. We have: $\tilde A\varphi = i\, d\varphi(x)/dx$, since $\tilde A \subset A^*$, and $D(\tilde A)$ must be 
wider than $D(\bar A)$. But we shall prove shortly that, if $\varphi(x) \in D(\tilde A)$, then 
$\varphi(x) \in D(\bar A)$, and this contradiction shows that $\bar A$ has no self-conjugate extensions. 

Let $\varphi(x) \in D(\tilde A)$, and hence $\varphi(x) \in D(A^*)$. The formula $(\tilde A\varphi, \varphi) = (\varphi, \tilde A\varphi)$ leads 
us to the equation 

$$\int_0^{+\infty} i\, \frac{d\varphi(x)}{dx}\, \overline{\varphi(x)}\, dx = \int_0^{+\infty} \varphi(x)\, \overline{i\, \frac{d\varphi(x)}{dx}}\, dx,$$

from which it follows, since $\varphi(x) \to 0$ as $x \to +\infty$, that $\varphi(0) = 0$, i.e. 
$\varphi(x) \in D(\bar A)$. 
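
The way the boundary value at x = 0 is forced to vanish can be made explicit. The lines below are a sketch of the computation behind the last step, for a function $\varphi$ of the domain of the conjugate operator that tends to zero at infinity:

```latex
\begin{align*}
  \int_0^{+\infty} i\, \frac{d\varphi}{dx}\, \overline{\varphi}\, dx
    - \int_0^{+\infty} \varphi\, \overline{i\, \frac{d\varphi}{dx}}\, dx
  &= i \int_0^{+\infty} \Big( \frac{d\varphi}{dx}\, \overline{\varphi}
        + \varphi\, \frac{d\overline{\varphi}}{dx} \Big) dx \\
  &= i\, |\varphi(x)|^2\, \Big|_{x=0}^{x=+\infty} = -\,i\, |\varphi(0)|^2 .
\end{align*}
% The left-hand side vanishes for a symmetric extension, so \varphi(0) = 0.
```

Unlike the interval [0, 1] of the first example, the half-line has only one finite end point, so there is no second boundary value with which the term $|\varphi(0)|^2$ could be balanced.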

4. The operator $-d^2/dx^2 + q(x)$ in space $L_2[0, 1]$. 

Let q(x) be a real function, continuous in the interval [0, 1], and A the 
operator $[-d^2/dx^2 + q(x)]$ on the set D(A) of all functions $\varphi(x)$ with the following 
properties: $\varphi(x)$ and $d\varphi(x)/dx$ are absolutely continuous for $x \in [0, 1]$, 
$\varphi(0) = \varphi(1) = 0$ and $d^2\varphi(x)/dx^2 \in L_2(0, 1)$. It is easily shown that A is a symmetric 
operator and D(A) is dense in H. 

We shall assume that q(x) is such that the equation $-y'' + q(x)y = 0$ has 
no solutions vanishing at x = 0 and x = 1, apart from the trivial $y \equiv 0$. Let 
us show that R(A) = H, whence it follows that A is self-conjugate. 

Let $f(x) \in L_2(0, 1)$. We have to show that a function of D(A) exists for 
which $A\varphi(x) = f(x)$. 

We introduce the function 

$$\psi(x) = -\int_0^x \Big[ \int_0^t f(\tau)\, d\tau \Big] dt + x \int_0^1 \Big[ \int_0^t f(\tau)\, d\tau \Big] dt,$$

which obviously belongs to D(A), where $-\psi''(x) = f(x)$, and let $\omega(x)$ be a 
solution of the equation 

$$-\omega''(x) + q(x)\, \omega(x) = -q(x)\, \psi(x),$$

satisfying the condition $\omega(0) = \omega(1) = 0$. Such a solution (with continuous 
derivatives up to the second order) exists [IV; 173]. It is easily verified directly 
that the function $\varphi(x) = \psi(x) + \omega(x)$ belongs to D(A) and $A\varphi(x) = f(x)$. 
This is what we wanted to prove. Hence A is self-conjugate. The operator A 
has a bounded inverse [IV; 173]: 

$$A^{-1} f(x) = -\int_0^1 G(x, t)\, f(t)\, dt,$$

where G(x, t) is the Green's function of the operator A with the boundary 
condition $\omega(0) = \omega(1) = 0$. The above condition, that the only solution of 
$-y'' + q(x)y = 0$ vanishing at x = 0 and x = 1 is the trivial one, is fulfilled if say q(x) > 0. 
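
The direct verification mentioned in this example takes only a few lines. The sketch below uses the auxiliary function $\psi$ with $-\psi'' = f$ and the solution $\omega$ of the inhomogeneous equation, both as introduced above, and checks the boundary values and the equation for their sum $\varphi$:

```latex
\begin{align*}
  \psi(0) &= 0, \qquad
  \psi(1) = -\int_0^1 \Big[ \int_0^t f(\tau)\, d\tau \Big] dt
            + \int_0^1 \Big[ \int_0^t f(\tau)\, d\tau \Big] dt = 0, \\
  A\varphi &= -(\psi + \omega)'' + q\,(\psi + \omega)
            = \underbrace{-\psi''}_{=\, f}
              + \underbrace{(-\omega'' + q\,\omega)}_{=\, -q\,\psi}
              + \, q\,\psi
            = f .
\end{align*}
% Both \psi and \omega vanish at x = 0 and x = 1, so \varphi = \psi + \omega
% satisfies the boundary conditions required of elements of D(A).
```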

5. The operator $D^k = (i)^k\, \partial^k / \partial x_1 \dots \partial x_k$ in space $H = L_2(D)$, where D 
is a bounded domain in $R_n$. 

Let A be the operator $D^k$ on all k-times continuously differentiable functions 
$\varphi(x)$, finite in D. We know that D(A) is dense in H, and it is easily verified 
with the aid of (123) of [109] that A is symmetric. The domain of definition of 
the conjugate operator $A^*$ will consist of all functions $\varphi(x) \in L_2(D)$, having 
inside D a generalized derivative of the form $D^k$, belonging to $L_2(D)$ (this follows 
from the definition of generalized derivative). It is clear that $D(A^*)$ is wider 
than D(A). A bounded inverse exists on R(A). For, let $\varphi(x) \in D(A)$. We continue 
it by zero outside D and include D in the cube: $-a < x_i < a$. We can now 
write $\varphi(x)$ as 

$$\varphi(x) = \int_{-a}^{x_1} \dots \int_{-a}^{x_k} (-i)^k A\varphi(x_1, \dots, x_n)\, dx_1 \dots dx_k,$$

whence, on using Buniakowski's inequality, we easily obtain 

$$\|\varphi\|_{L_2(D)} \le C\, \|A\varphi\|_{L_2(D)},$$

so that $A^{-1}$ exists on R(A), $\|A^{-1}\| \le C$ and $R(\bar A)$ is a subspace of H. As will 
be shown later in the theory of extensions of operators, there exists at least 
one self-conjugate extension for such symmetric operators. 


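6. The operator $-\Delta = -\sum_{k=1}^{n} \partial^2 / \partial x_k^2$ in space $H = L_2(D)$, where D is a 
bounded domain of $R_n$. 

Let A be the operator $-\Delta$, defined on all twice continuously differentiable, 
finite functions in D, i.e. let $D(A) = \mathring{C}^{(2)}(D)$. The set $\mathring{C}^{(2)}(D)$ is dense in H. 
If $\varphi(x) \in D(A)$, then 

$$\int_D -\Delta\varphi(x)\, \overline{\varphi(x)}\, dx = \int_D \sum_{k=1}^{n} \Big| \frac{\partial \varphi}{\partial x_k} \Big|^2 dx, \tag{28}$$

i.e. A is positive, and hence symmetric. In addition, we know from [114] that 
for all $\varphi(x) \in D(A)$, 

$$\|\varphi\|^2_{L_2(D)} \le C^2 \int_D \sum_{k=1}^{n} \Big| \frac{\partial \varphi}{\partial x_k} \Big|^2 dx, \tag{29}$$

with the same constant C, depending only on the dimensions of D. It follows 
from (28) and (29) that 

$$\|\varphi\|_{L_2(D)} \le C^2\, \|A\varphi\|_{L_2(D)}, \tag{30}$$

i.e. A is positive definite, and the bounded inverse $A^{-1}$ exists on R(A), with 
$\|A^{-1}\| \le C^2$ and $R(\bar A) = \overline{R(A)}$. 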
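
The passage from (28) and (29) to (30) again rests on Buniakowski's inequality. The sketch below assumes that (29) is read as a bound of the squared norm of $\varphi$ by $C^2$ times the integral on the right-hand side of (28), which is the reading consistent with the constant $C^2$ in (30):

```latex
\begin{align*}
  \|\varphi\|^2_{L_2(D)}
    &\le C^2 \int_D \sum_{k=1}^{n} \Big| \frac{\partial \varphi}{\partial x_k} \Big|^2 dx
     = C^2\, (A\varphi, \varphi)
     \le C^2\, \|A\varphi\|_{L_2(D)}\, \|\varphi\|_{L_2(D)}, \\
  \text{so that}\quad
  (A\varphi, \varphi) &\ge C^{-2}\, \|\varphi\|^2_{L_2(D)}
  \quad\text{and}\quad
  \|\varphi\|_{L_2(D)} \le C^2\, \|A\varphi\|_{L_2(D)} .
\end{align*}
% The first consequence is positive definiteness, with lower bound C^{-2};
% the second is (30), i.e. \|A^{-1}\| \le C^2 on R(A).
```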

As will be shown later, such operators A admit of self-conjugate extensions, 
where each extension corresponds to some boundary value problem for the 
Laplace operator. Here we shall explain the structure of $D(\bar A)$ and $D(A^*)$. 
We take $\varphi(x) \in D(A)$. We can easily show by integration by parts that 

$$\int_D |\Delta\varphi|^2\, dx = \int_D \sum_{i,k=1}^{n} \Big| \frac{\partial^2 \varphi}{\partial x_i\, \partial x_k} \Big|^2 dx, \tag{31}$$

from which it follows, if (28) and (29) are taken into account, that 

$$\|\varphi\|_{W_2^{(2)}(D)} \le C_1\, \|A\varphi\|_{L_2(D)}, \tag{32}$$

where $C_1$ is a constant depending only on the domain. Now let $\varphi_m(x) \in D(A)$, 
$\varphi_m(x) \Rightarrow \varphi(x)$ and $A\varphi_m(x) \Rightarrow \bar A\varphi(x)$. It follows from (32) that the $\varphi_m(x)$ 
now converge to $\varphi(x)$ in the norm of $W_2^{(2)}(D)$, so that $\varphi(x)$ belongs to $W_2^{(2)}(D)$ 
and $\bar A\varphi = -\Delta\varphi(x)$. We wrote $\mathring{W}_2^{(2)}(D)$ for this completion of D(A) in [113], 
so that $D(\bar A) = \mathring{W}_2^{(2)}(D)$. 
Now let $\varphi(x) \in D(A^*)$. This implies that a function $\psi(x) \in H$ exists such that 

$$\int_D -\Delta\omega(x)\, \overline{\varphi(x)}\, dx = \int_D \omega(x)\, \overline{\psi(x)}\, dx \tag{33}$$

for all $\omega(x) \in D(A)$. We form the Newtonian potential 

$$u(x) = \frac{1}{(n-2)\, k_n} \int_D \frac{\psi(y)}{|x - y|^{n-2}}\, dy, \tag{34}$$

where $k_n$ is the area of the unit sphere in $R_n$. 
We know [T; 201] that, if $\psi(y)$ is continuously differentiable, u(x) is twice 
continuously differentiable in D and 

$$-\Delta u(x) = \psi(x). \tag{35}$$


We show next that, if $\psi(y) \in L_2(D)$, then $u(x) \in W_2^{(2)}(D)$. To do this, we 
extend $\psi(y)$ by zero outside D, form the average $\psi_\rho(y)$ and consider the 
functions 

$$u_\rho(x) = \frac{1}{(n-2)\, k_n} \int \frac{\psi_\rho(y)}{|x - y|^{n-2}}\, dy.$$

They are twice continuously differentiable and satisfy 

$$-\Delta u_\rho(x) = \psi_\rho(x). \tag{36}$$

As $\rho \to 0$, $u_\rho(x)$ and $\partial u_\rho(x)/\partial x_k$ are convergent to u(x) and $\partial u(x)/\partial x_k$ in the 
norm of $L_2(D_1)$ [115], where $D_1$ is any bounded domain. We shall assume that 
D lies strictly inside $D_1$. By (36), we can say that, for $v(x) = u_\rho(x) - u_{\rho'}(x)$, 

$$\int_{D_1} |\Delta v(x)|^2\, \zeta(x)\, dx = \int_{D_1} |\psi_\rho(x) - \psi_{\rho'}(x)|^2\, \zeta(x)\, dx, \tag{37}$$

where $\zeta(x)$ is a fixed non-negative, twice continuously differentiable function 
(such functions can be shown to exist) equal to unity in D, to zero outside $D_1$, 
and satisfying everywhere the condition: 

$$\sum_{k=1}^{n} \zeta_{x_k}^2(x) \le C_2\, \zeta(x). \tag{38}$$
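By using the formula for integration by parts, we can transform (37) to 

$$\int_{D_1} |\psi_\rho(x) - \psi_{\rho'}(x)|^2\, \zeta(x)\, dx = \int_{D_1} \sum_{i,k=1}^{n} \Big| \frac{\partial^2 v(x)}{\partial x_i\, \partial x_k} \Big|^2 \zeta(x)\, dx + \int_{D_1} \sum_{i,k=1}^{n} \Big[ \frac{\partial^2 v(x)}{\partial x_i\, \partial x_k}\, \frac{\partial \overline{v(x)}}{\partial x_k}\, \frac{\partial \zeta(x)}{\partial x_i} - \frac{\partial^2 v(x)}{\partial x_i^2}\, \frac{\partial \overline{v(x)}}{\partial x_k}\, \frac{\partial \zeta(x)}{\partial x_k} \Big] dx.$$

Hence, on using inequality (38) and the inequality $|2ab| \le \varepsilon |a|^2 + |b|^2/\varepsilon$, 
valid for any $\varepsilon > 0$, we find that 

$$\int_{D_1} \sum_{i,k=1}^{n} \Big| \frac{\partial^2 v(x)}{\partial x_i\, \partial x_k} \Big|^2 \zeta(x)\, dx \le \int_{D_1} |\psi_\rho(x) - \psi_{\rho'}(x)|^2\, \zeta(x)\, dx + \varepsilon C_2 \int_{D_1} \sum_{i,k=1}^{n} \Big| \frac{\partial^2 v(x)}{\partial x_i\, \partial x_k} \Big|^2 \zeta(x)\, dx + \frac{1}{\varepsilon} \int_{D_1} \sum_{k=1}^{n} \Big| \frac{\partial v(x)}{\partial x_k} \Big|^2 dx.$$

We take $\varepsilon = 1/2C_2$. It now follows from the last inequality that 

$$\frac12 \int_{D_1} \sum_{i,k=1}^{n} \Big| \frac{\partial^2 v(x)}{\partial x_i\, \partial x_k} \Big|^2 \zeta(x)\, dx \le \int_{D_1} |\psi_\rho(x) - \psi_{\rho'}(x)|^2\, \zeta(x)\, dx + 2C_2 \int_{D_1} \sum_{k=1}^{n} \Big| \frac{\partial v(x)}{\partial x_k} \Big|^2 dx,$$

whence we can conclude that, as $\rho$ and $\rho' \to 0$, the function 
$v(x) = u_\rho(x) - u_{\rho'}(x)$ tends to zero along with its first and second derivatives in the norm 
of $L_2(D)$. 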

Consequently the limit function u(x) of $u_\rho(x)$, defined by integral (34), 
belongs to $W_2^{(2)}(D)$ and satisfies equation (35). We therefore have for it: 

$$\int_D -\Delta\omega(x)\, \overline{u(x)}\, dx = \int_D \omega(x)\, \overline{\psi(x)}\, dx$$

for any $\omega(x) \in D(A)$. On subtracting it from (33), we get the identity for $\varphi(x)$: 

$$\int_D -\Delta\omega(x)\, \overline{[\varphi(x) - u(x)]}\, dx = 0.$$

We showed in [119] that this identity implies that the function $\varphi(x) - u(x)$ 
is harmonic, there being a corresponding identity for any function harmonic 
in D. 

We have thus obtained the following form for $\varphi(x) \in D(A^*)$: 

$$\varphi(x) = \frac{1}{(n-2)\, k_n} \int_D \frac{\psi(y)}{|x - y|^{n-2}}\, dy + v(x),$$

where $\psi(x) \in L_2(D)$, and v(x) is a function harmonic in D, and since $\varphi(x)$ and 
the integral belong to $L_2(D)$, $v(x) \in L_2(D)$ also. It is easily shown that any 
function $\varphi(x)$ of this form belongs to $D(A^*)$ and that 

$$A^*\varphi = -\Delta\varphi(x) = \psi(x).$$


7. The operator $-\Delta = -\sum_{k=1}^{n} \partial^2 / \partial x_k^2$ in space $H = L_2(R_n)$. We shall start 
by considering the operator $-\Delta + \lambda E$, where $\lambda$ is any given real number, 
instead of $-\Delta$ itself. Let us take a positive $\lambda$, say equal to unity. As will be 
seen below, this guarantees the existence of a bounded inverse of $-\Delta + E$, 
and thereby simplifies the problem of the self-conjugate extensions of the 
operator. 

Thus, let A be the operator $-\Delta + E$, defined on twice continuously 
differentiable finite functions $\varphi(x)$. We know that D(A) is dense in H, and it is easily 
seen that $-\Delta + E$ is symmetric. It follows from the discussion of the previous 
example that the elements of $D(\bar A)$ and $D(A^*)$ have generalized derivatives up 
to the second order, square summable over any finite domain D of $R_n$. The 
operator $A^*$ is evaluated on $D(A^*)$ as the differential operator $-\sum_{k=1}^{n} \partial^2 / \partial x_k^2 + E$. 

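Let us show that $D(\bar A) = W_2^{(2)}(R_n)$. Firstly,
$D(\bar A) \subset W_2^{(2)}(R_n)$, since if $\varphi_m(x) \in D(A)$ and
$\varphi_m(x) \Rightarrow \varphi(x)$, $A\varphi_m(x) \Rightarrow \bar A\varphi(x)$,
it follows from (28) and (31) that

$$\begin{aligned}
&\int_{R_n} \Big[ \sum_{k=1}^{n} \Big| \frac{\partial(\varphi_l - \varphi_m)}{\partial x_k} \Big|^2
 + \sum_{i,k=1}^{n} \Big| \frac{\partial^2(\varphi_l - \varphi_m)}{\partial x_i\, \partial x_k} \Big|^2 \Big]\, dx = \\
&\quad = \int_{R_n} \Big[ -\Delta(\varphi_l - \varphi_m)\, \overline{(\varphi_l - \varphi_m)}
 + |\Delta(\varphi_l - \varphi_m)|^2 \Big]\, dx \le \\
&\quad \le \|\Delta(\varphi_l - \varphi_m)\|_{L_2(R_n)} \cdot \|\varphi_l - \varphi_m\|_{L_2(R_n)}
 + \|\Delta(\varphi_l - \varphi_m)\|^2_{L_2(R_n)} \to 0
\end{aligned}$$

as $l$ and $m \to \infty$.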


Let us prove the reverse inclusion, i.e. that $W_2^{(2)}(R_n) \subset D(\bar A)$.
Let $\varphi(x) \in W_2^{(2)}(R_n)$. We form a sequence of functions
$\psi_m(x) = \varphi_{1/m}(x)\, \zeta_m(x)$, where $\varphi_{1/m}(x)$ is an
averaging of $\varphi(x)$ with radius $1/m$, and $\zeta_m(r)$, $r = |x|$, is a
twice continuously differentiable finite function defined for $r \ge 0$ as
follows:

$$\zeta_m(r) = \begin{cases}
1 & \text{for } 0 \le r \le m, \\
0 & \text{for } r \ge m + 1, \\
\xi_1(r - m) & \text{for } m \le r \le m + 1,
\end{cases}$$

and $\xi_1(r) \le 1$ is a smooth non-negative function, equal to zero for
$r \ge 1$. Each $\psi_m(x) \in D(A)$. Let us show that
$\psi_m(x) \to \varphi(x)$ as $m \to \infty$ in the norm of $W_2^{(2)}(R_n)$. On
splitting into two parts the integration over $R_n$ in the expression for the
norm of $W_2^{(2)}(R_n)$: the part where $|x| \le l$, and the part where
$|x| > l$, we obtain


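$$\|\varphi - \psi_m\|^2_{W_2^{(2)}(R_n)} = \|\varphi - \psi_m\|^2_{W_2^{(2)}(|x| \le l)} + \|\varphi - \psi_m\|^2_{W_2^{(2)}(|x| > l)}.$$

Given $\varepsilon > 0$, the second term can be made $< \varepsilon/2$ for all
$m$, if we take $l$ sufficiently large. This follows from
$\|\varphi\|_{W_2^{(2)}(R_n)} < +\infty$, from the formation of the
$\zeta_m(x)$ and the properties of the average of $\varphi(x)$ and its
generalized derivatives. We fix $l$ in the manner indicated. When $m > l$, we
have $\psi_m(x) = \varphi_{1/m}(x)$ for $|x| < l$ and
$\varphi_{1/m}(x) \to \varphi(x)$ in the norm of $W_2^{(2)}(|x| < l)$.

It follows from this that, for all sufficiently large $m$,

$$\|\varphi - \psi_m\|^2_{W_2^{(2)}(|x| < l)} < \frac{\varepsilon}{2} \quad \text{and} \quad \|\varphi - \psi_m\|^2_{W_2^{(2)}(R_n)} < \varepsilon.$$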


The inclusion $W_2^{(2)}(R_n) \subset D(\bar A)$ follows from what has been
said, and so we can take it as proved that $D(\bar A) = W_2^{(2)}(R_n)$. Let us
show that $\bar A = A^*$, i.e. that $\bar A$ is a self-conjugate operator. It is
sufficient to show that the domain of values of $\bar A$ is the whole of
$L_2(R_n)$. We shall take $n = 3$ for simplicity. We take any finite
continuously differentiable function $\rho(x) \in L_2(R_n)$. The function

$$u(x) = \frac{1}{4\pi} \int_{R_n} \frac{e^{-|x - y|}}{|x - y|}\, \rho(y)\, dy$$

is twice continuously differentiable, satisfies $-\Delta u + u = \rho$ and
decreases exponentially as $|x| \to \infty$ together with its derivatives
[IV; 231], so that certainly $u(x) \in W_2^{(2)}(R_n)$. Consequently,
$u(x) \in D(\bar A)$, $\bar A u = \rho$ and the domain of values $R(\bar A)$ of
$\bar A$ is dense in $H$.

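Let us show that a bounded inverse $\bar A^{-1}$ exists on $R(\bar A)$. It will
follow from this that $R(\bar A)$ is a subspace, i.e. $R(\bar A) = H$.

Thus, let $v(x) \in D(\bar A)$ and $-\Delta v(x) + v(x) = \rho(x)$. We multiply
this equation by $\overline{v(x)}$ and integrate over $R_n$, after which we use
integration by parts:

$$\int_{R_n} \rho(x)\, \overline{v(x)}\, dx = \int_{R_n} \Big[ \sum_{k=1}^{n} \Big| \frac{\partial v(x)}{\partial x_k} \Big|^2 + |v(x)|^2 \Big]\, dx.$$

This gives us, in view of Cauchy's inequality,

$$\|v\|_{L_2(R_n)} \le \|\rho\|_{L_2(R_n)} = \|\bar A v\|_{L_2(R_n)},$$

i.e. $\bar A^{-1}$ in fact exists on $R(\bar A)$ and $\|\bar A^{-1}\| \le 1$.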

This completes our proof that $\bar A = A^*$, or, what amounts to the same
thing, that the closure of $A$ amounts to its (unique) self-conjugate
extension, where $D(\bar A) = W_2^{(2)}(R_n)$. This also holds for
$-\Delta = A - E$, i.e. the operator $-\Delta$ is self-conjugate on
$W_2^{(2)}(R_n)$. However, as distinct from $-\Delta + E$, $-\Delta$ has no
bounded inverse (it is easy to show that the inverse of $-\Delta$ exists, but
is not bounded).
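
The estimate $\|v\| \le \|(-\Delta + E)v\|$ can be observed on a finite-difference analogue. The following is only a sketch under stated assumptions: the periodic grid, the mesh width `h`, and the function names are illustrative choices that do not come from the text, and a one-dimensional difference operator stands in for $-\Delta$.

```python
import math
import random


def apply_minus_laplacian_plus_identity(v, h):
    """Periodic finite-difference analogue of -d^2/dx^2 + 1 (an assumed model)."""
    n = len(v)
    return [
        (2.0 * v[i] - v[(i - 1) % n] - v[(i + 1) % n]) / (h * h) + v[i]
        for i in range(n)
    ]


def l2_norm(v, h):
    """Discrete L2 norm with mesh width h."""
    return math.sqrt(h * sum(t * t for t in v))


random.seed(0)
n, h = 64, 0.1
v = [random.uniform(-1.0, 1.0) for _ in range(n)]
Av = apply_minus_laplacian_plus_identity(v, h)

norm_v = l2_norm(v, h)
norm_Av = l2_norm(Av, h)
# Discrete counterpart of the integration by parts: (Av, v) = |grad v|^2 + |v|^2.
inner = h * sum(a * b for a, b in zip(Av, v))

print(norm_v <= norm_Av, inner >= norm_v ** 2)
```

In this model the summation by parts gives $(Av, v) \ge \|v\|^2$, and Cauchy's inequality then yields $\|v\| \le \|Av\|$, which mirrors the argument used above for $\bar A^{-1}$.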


189. The spectrum of a self-conjugate operator. It may be shown 
as in [128] that the eigenvalues of a self-conjugate operator are real, 
whilst the eigenelements corresponding to different eigenvalues are 
orthogonal, and a self-conjugate operator generates an orthonormal 
system of eigenelements. Notice that, since a self-conjugate operator 
is closed, the eigenelements corresponding to a fixed eigenvalue 
(including the zero element) form a subspace. 

The definitions of a regular point of an operator and of a point of 
the spectrum are the same as in [129]. We shall now prove, for a self- 
conjugate operator, analogues of the theorems of [129]. 

THEOREM 1. If $\lambda$ is not an eigenvalue of a self-conjugate operator $A$,
the lineal $R(A - \lambda E)$ is dense in $H$.

If $\lambda$ is real, $A - \lambda E$ is a self-conjugate operator, and the
theorem is a consequence of Theorem 1 of [187]. Let $\lambda$ not be real. If
we were to have $\overline{R(A - \lambda E)} \ne H$, there would exist a
non-zero element $z$, orthogonal to $R(A - \lambda E)$, i.e.

$$((A - \lambda E)x, z) = 0 \quad \text{or} \quad ((A - \lambda E)x, z) = (x, 0).$$

It follows from this that $z \in D(A)$ and
$(A - \lambda E)^* z = (A - \bar\lambda E)z = 0$, i.e. $Az = \bar\lambda z$,
which is absurd, since $A$ can only have real eigenvalues.

THEOREM 2. The necessary and sufficient condition for $\lambda$ to be a regular
point of a self-conjugate operator $A$ is that there exist a positive number
$p$ such that

$$\|(A - \lambda E)x\| \ge p\, \|x\| \quad (x \in D(A)). \tag{39}$$

Non-real values of $\lambda$ are regular points of a self-conjugate operator.

The proof is the same as in [129], except that we have to use the 
fact that A is closed, instead of its continuity. 

The following corollary may be obtained, as in [129]: points of the 
spectrum form a closed set. 

It will be seen later that every self-conjugate operator has at least one
point in its spectrum.

Non-real $\lambda$ are regular points of the self-conjugate operator $A$, so
that in future we shall only speak about real $\lambda$. In this case the
operator $(A - \lambda E)$ is self-conjugate.

Suppose that $\lambda$ is not an eigenvalue. If $R(A - \lambda E) = H$, the
closed operator $(A - \lambda E)^{-1}$, defined on the whole of $H$, is
bounded, i.e. $\lambda$ is a regular point.

Conversely, if $R(A - \lambda E)$ is not $H$, it follows from Theorem 3 of
[187] that $(A - \lambda E)^{-1}$ is unbounded on $R(A - \lambda E)$. We arrive
at the following theorem:

THEOREM 3. Let the real $\lambda$ not be an eigenvalue. If
$R(A - \lambda E) = H$, then $\lambda$ is a regular point, and if the lineal
$R(A - \lambda E)$ is not the whole of $H$ (this lineal is dense in $H$),
$\lambda$ is a point of the spectrum.

We now take the case when $\lambda$ is an eigenvalue, and we write $P_\lambda$
for the subspace of corresponding eigenelements (including the zero element).
A unique operator exists, for which $P_\lambda = H$; this is the operator of
multiplication by the number $\lambda$: $Ax = \lambda x$ for all $x \in H$. In
the remaining cases the subspace $P_\lambda$ is a proper part of $H$.

Let us now show that $P_\lambda$ is the set of elements $x$ orthogonal to the
lineal $(A - \lambda E)z$ ($z \in D(A)$).

For, since $(A - \lambda E)$ is self-conjugate, it follows from
$((A - \lambda E)z, x) = 0$ that $x \in D(A)$ and $(A - \lambda E)x = 0$, and
conversely, if $x \in D(A)$ and $(A - \lambda E)x = 0$, then
$((A - \lambda E)z, x) = 0$. It follows from this that the subspace
$Q_\lambda$, complementary to $P_\lambda$,

$$H = P_\lambda \oplus Q_\lambda,$$

is the closure of the lineal of elements $y$ defined by

$$y = (A - \lambda E)z \quad (z \in D(A)),$$

i.e. $Q_\lambda = \overline{R(A - \lambda E)}$. We write the element $z$ of
$D(A)$ as $z = z_1 + z_2$, where $z_1 \in P_\lambda$ and $z_2 \in Q_\lambda$. It
follows from the definition of $P_\lambda$ that $z_1 \in D(A)$ and
$Az_1 = \lambda z_1$, so that $z_2 = z - z_1 \in D(A)$ also, i.e. the projection
of the lineal $D(A)$ on $Q_\lambda$ is a lineal of $D(A)$. Let us denote it by
$D_\lambda(A)$. This is obviously the lineal of the elements of $Q_\lambda$
that belong to $D(A)$. The operator $A$ is defined on this lineal, and it may
easily be seen that, if $y \in D_\lambda(A)$, then $Ay \in Q_\lambda$. For,
$(y, x) = 0$ by hypothesis, if $x \in D(A)$ and satisfies the equation
$Ax = \lambda x$, and hence $(Ay, x) = (y, Ax) = (y, \lambda x) = 0$. By what has
been said, we can regard $A$ as an operator in the subspace $Q_\lambda$, which
we can look on as a new space $H$. Let $A_\lambda$ denote this operator, so
that $A_\lambda$ is defined on $D_\lambda(A)$ and $A_\lambda y = Ay$ if
$y \in D_\lambda(A)$. Following the usual notation, we can write
$D(A_\lambda)$ instead of $D_\lambda(A)$.

It follows from a general theorem which we shall prove in [191] that
$D(A_\lambda)$ is dense in $Q_\lambda$ and that $A_\lambda$ is a
self-conjugate operator in $Q_\lambda$. By virtue of the actual construction
of $Q_\lambda$, $\lambda$ cannot be an eigenvalue of $A_\lambda$, but it may be
either a regular point or a point of the spectrum of this operator. In the
former case, $(A_\lambda - \lambda E)z$ ($z \in D(A_\lambda)$) transforms
$D(A_\lambda)$ into the complete space $Q_\lambda$, and in the latter case into
a lineal dense in $Q_\lambda$. We can write $(A - \lambda E)z$ instead of
$(A_\lambda - \lambda E)z$. We have $(A - \lambda E)z = 0$ for all $z$ of
$P_\lambda$.

The above discussion leads to the following classification of values
of $\lambda$.

I. Regular values of $\lambda$, which are characterized by the fact that
$R(A - \lambda E) = H$, and the existence of a bounded inverse
$(A - \lambda E)^{-1}$.

II. Values of $\lambda$ for which $R(A - \lambda E)$ is a lineal different from
$H$, and $\overline{R(A - \lambda E)} = H$. The inverse unbounded operator
$(A - \lambda E)^{-1}$ exists on $R(A - \lambda E)$. We usually say that such
values of $\lambda$ belong to the continuous spectrum.

III. Values of $\lambda$ which are eigenvalues of $A$, and for which
$A_\lambda$ has $\lambda$ as a regular point. For these values,
$R(A - \lambda E)$ is a subspace, not the same as $H$. It is generally said of
these $\lambda$ that they belong to a point spectrum only.

IV. Values of $\lambda$ which are eigenvalues of $A$ and for which $A_\lambda$
has $\lambda$ as a point of its spectrum. For these values, $R(A - \lambda E)$
is not a subspace, but is a lineal, the closure of which
$\overline{R(A - \lambda E)}$ is a subspace, not the same as $H$. We say of
these $\lambda$ that they belong simultaneously to a point and continuous
spectrum.

Certain types of values of $\lambda$ may be absent in the spectrum of a
self-conjugate operator $A$. But we have shown that the spectrum of a bounded
self-conjugate operator contains at least one point. It is easily seen that
the same is true for an unbounded self-conjugate operator $A$. For, suppose
that $(A - \lambda E)$ has a bounded inverse for any real $\lambda$. We have
the obvious equation

$$\frac{1}{\lambda}(A - \lambda E)A^{-1} = \frac{1}{\lambda}E - A^{-1} \quad (\lambda \ne 0),$$

where both sides represent a self-conjugate operator. It follows from this
equation and the proposition in question that $R(\mu E - A^{-1})$ is the whole
of $H$ for any real $\mu = 1/\lambda \ne 0$, whilst this contradicts the fact
that the bounded non-zero operator $A^{-1}$ has spectral points other than
zero.

It will be shown below that an unbounded self-conjugate operator has an
infinite set of spectral points, distributed outside any fixed interval of the
$\lambda$ axis.


If $\lambda$ is not an eigenvalue of $A$, the operator

$$R_\lambda = (A - \lambda E)^{-1}, \tag{40}$$

is called the resolvent of $A$, as we know. It is defined on
$R(A - \lambda E)$ and transforms this lineal one-to-one into $D(A)$. It
follows from the definition of inverse operator that, if
$x \in R(A - \lambda E)$ and $R_\lambda x = 0$, then $x = 0$ (cf. [144]). As in
the case of bounded operators, we have

$$R_\lambda^* = R_{\bar\lambda}. \tag{41}$$

If $\lambda$ is a real number, this follows from the fact that $R_\lambda$ is
self-conjugate. When $\lambda$ is complex, it follows from the equations
$(A - \lambda E)^* = A - \bar\lambda E$ and
$[(A - \lambda E)^{-1}]^* = [(A - \lambda E)^*]^{-1}$. If $\lambda$ and $\mu$
are regular values, it may be shown, precisely as for bounded operators,
that [144]:

$$R_\mu - R_\lambda = (\mu - \lambda)\, R_\mu R_\lambda. \tag{42}$$
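
For the reader's convenience, the computation behind the resolvent identity (42) may be written out. This is a sketch of the standard argument, assuming only that $\lambda$ and $\mu$ are regular, so that $R_\lambda y \in D(A)$ for every $y \in H$ and both resolvents are defined on the whole of $H$:

```latex
\begin{aligned}
R_\mu - R_\lambda
  &= R_\mu (A - \lambda E) R_\lambda - R_\mu (A - \mu E) R_\lambda \\
  &= R_\mu \bigl[ (A - \lambda E) - (A - \mu E) \bigr] R_\lambda \\
  &= (\mu - \lambda)\, R_\mu R_\lambda .
\end{aligned}
```

The first line uses $(A - \lambda E) R_\lambda y = y$ for $y \in H$ and $R_\mu (A - \mu E) x = x$ for $x = R_\lambda y \in D(A)$.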


190. The case of a point spectrum. A self-conjugate operator $A$ is
said to have a point spectrum if the orthonormal system of its eigenelements
is complete in $H$. Let $x_k$ ($k = 1, 2, \ldots$) be this system, enumerated
in some manner, and let $\lambda_k$ be the corresponding eigenvalues:
$Ax_k = \lambda_k x_k$. By hypothesis, any element $x \in H$ is expressible by
its Fourier series:

$$x = \sum_{k=1}^{\infty} a_k x_k, \quad a_k = (x, x_k). \tag{43}$$

THEOREM 1. The necessary and sufficient condition for $x$ to belong to
$D(A)$ is that the series

$$\sum_{k=1}^{\infty} \lambda_k^2\, |a_k|^2 \tag{44}$$

be convergent, and if this condition is fulfilled,

$$Ax = \sum_{k=1}^{\infty} \lambda_k a_k x_k. \tag{45}$$

If $x \in D(A)$, a Fourier coefficient of the element $Ax$ is
$(Ax, x_k) = (x, Ax_k) = (x, \lambda_k x_k) = \lambda_k a_k$, whence follows the
convergence of series (44). Suppose conversely that (44) is convergent, and
let us show that $x \in D(A)$. Since (44) is convergent, we can form the
element

$$x' = \sum_{k=1}^{\infty} \lambda_k a_k x_k \tag{46}$$

and, on writing $y_n$ for a segment of series (43), we have $y_n \in D(A)$,
$y_n \Rightarrow x$ and $Ay_n \Rightarrow x'$, whence it follows, since $A$ is
closed, that $x \in D(A)$ and $Ax = x'$. The theorem is proved.
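
The criterion (44) can be illustrated numerically. The sketch below assumes a hypothetical operator with eigenvalues $\lambda_k = k$ and compares two coefficient sequences chosen for illustration: $a_k = 1/k$, which defines an element of $H$ lying outside $D(A)$, and $a_k = 1/k^2$, which defines an element of $D(A)$. The function names are not from the text.

```python
def eigenvalue(k):
    """Hypothetical eigenvalues lambda_k = k of an unbounded operator."""
    return float(k)


def outside_domain(k):
    """a_k = 1/k: sum |a_k|^2 converges, sum lambda_k^2 |a_k|^2 diverges."""
    return 1.0 / k


def inside_domain(k):
    """a_k = 1/k^2: both series converge."""
    return 1.0 / (k * k)


def partial_sums(coeff, n_terms):
    """Partial sums of sum |a_k|^2 and of series (44) over k = 1..n_terms."""
    norm_sq = 0.0
    series_44 = 0.0
    for k in range(1, n_terms + 1):
        a_k = coeff(k)
        norm_sq += a_k * a_k
        series_44 += eigenvalue(k) ** 2 * a_k * a_k
    return norm_sq, series_44


for n_terms in (10, 100, 1000):
    print(n_terms, partial_sums(outside_domain, n_terms), partial_sums(inside_domain, n_terms))
```

For $a_k = 1/k$ the partial sums of (44) grow like the number of terms, so the element does not belong to $D(A)$ although it belongs to $H$; for $a_k = 1/k^2$ they stay bounded.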

We have already seen that a completely continuous self-conjugate operator has
a purely point spectrum, where $\lambda_k \to 0$ as $k \to \infty$ for any
enumeration of the eigenvalues. We must mention a further important case of a
self-conjugate operator that has a point spectrum. Let $A$ be a self-conjugate
operator with a completely continuous inverse $A^{-1}$. By definition of the
inverse, the equation $A^{-1}x = 0$ has no non-trivial solutions. The
completely continuous self-conjugate operator $A^{-1}$ has a point spectrum,
and its eigenvalues $\mu_k$ ($k = 1, 2, \ldots$) can be enumerated in order of
non-increasing absolute value: $|\mu_1| \ge |\mu_2| \ge \ldots$, where
$\mu_k \ne 0$ for all $k$, by what has been said above. On writing $x_k$ for
the corresponding eigenelements that form an orthonormal system (complete in
$H$), we can write $A^{-1}x_k = \mu_k x_k$, whence it follows immediately that
$Ax_k = \lambda_k x_k$, where $\lambda_k = 1/\mu_k$. On recalling what was said
in [136], we arrive at the following theorem:

THEOREM 2. If a self-conjugate operator $A$ has a completely continuous
inverse $A^{-1}$, $A$ has a point spectrum, all its eigenvalues are of finite
rank and any finite interval contains only a finite number of eigenvalues
of $A$.

It follows from what has been said that the eigenvalues $\lambda_k$ of such an
operator $A$ can be enumerated in order of non-decreasing absolute value:
$|\lambda_1| \le |\lambda_2| \le \ldots$, where $|\lambda_k| \to +\infty$ as
$k \to \infty$.

Values other than eigenvalues may belong to the spectrum of an operator with a
point spectrum. For instance, if $\lambda$ is a point of condensation of the
$\lambda_k$, it must belong to the spectrum, since the regular points form an
open set. Let us show that no other $\lambda$ can belong to the spectrum.

THEOREM 3. If a self-conjugate operator $A$ has a point spectrum, every real
$\lambda$, different from an eigenvalue and not a point of condensation of
these values, is a regular point of $A$.

There exists by hypothesis a positive number $m$ such that
$|\lambda_k - \lambda| \ge m$ for all $k$. Let $x \in D(A)$. It follows from
(43) and (45) that

$$\|(A - \lambda E)x\|^2 = \sum_{k=1}^{\infty} |\lambda_k - \lambda|^2\, |a_k|^2 \ge m^2 \sum_{k=1}^{\infty} |a_k|^2 = m^2\, \|x\|^2,$$

whence the point $\lambda$ must be regular. In other words, $A$ has a purely
point spectrum in the present case.
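
The lower bound used in this proof can be checked on a truncated example. The sketch below is illustrative only: the eigenvalues $\lambda_k = k$ ($k = 1, \ldots, 200$), the point $\lambda = 2.5$, and the random coefficients are assumptions made for the demonstration, and a finite truncation stands in for the infinite series.

```python
import math
import random


def norm(coeffs):
    """Norm of an element given by its Fourier coefficients."""
    return math.sqrt(sum(c * c for c in coeffs))


random.seed(1)
eigenvalues = [float(k) for k in range(1, 201)]  # hypothetical lambda_k = k
lam = 2.5                                        # neither an eigenvalue nor a condensation point
m = min(abs(ev - lam) for ev in eigenvalues)     # distance from lam to the eigenvalues

# Fourier coefficients a_k of some x in D(A) (finitely many non-zero terms).
a = [random.uniform(-1.0, 1.0) / k for k in range(1, 201)]

# Coefficients of (A - lam E)x are (lambda_k - lam) a_k, by (45).
shifted = [(ev - lam) * a_k for ev, a_k in zip(eigenvalues, a)]

lhs = norm(shifted)   # ||(A - lam E) x||
rhs = m * norm(a)     # m ||x||
print(m, lhs >= rhs)
```

The inequality holds term by term, which is exactly why condition (39) is satisfied with $p = m$.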

If a self-conjugate operator A has no eigenvalues, we say that it 
has a purely continuous spectrum. In this case, instead of any element 
x of H being expressible as a Fourier series (43), it is expressible as a 
sum of integrals [cf. 149]. We shall show below that a point spectrum 
can be extracted from the purely continuous spectrum of a self-con- 
jugate operator, just as, in [189], we extracted one eigenvalue from 
the remaining part of the spectrum. Certain new concepts must be 
introduced in this connection. They are of interest in themselves for 
the theory of operators. 


191. Invariant subspaces and the reducibility of an operator. 
Before introducing the concept of invariant subspace, we must discuss 
the commutation of an operator B defined on the whole of H with an 
operator A defined on only part of H. 

DEFINITION. A bounded operator $B$, defined everywhere, is said to commute
with an operator $A$ when the following conditions are fulfilled: (1) if
$x \in D(A)$, then $Bx \in D(A)$; (2) if $x \in D(A)$, then $BAx = ABx$.

If A is bounded and specified everywhere, the first condition falls 
out, and we have the earlier definition of commuting operators. 

THEOREM 1. A necessary condition for $B$ to commute with a self-conjugate
operator $A$ is that it commute with the resolvent $R_\lambda$ for any regular
value of $\lambda$. A sufficient condition is that it commute with $R_\lambda$
for at least one regular $\lambda$.

The commutation of $B$ and $R_\lambda$ (with regular $\lambda$) is an ordinary
commutation, such as was defined earlier ($BR_\lambda x = R_\lambda Bx$ for
$x \in H$). Let $B$ commute with $A$, $\lambda$ be any regular value and $y$
any element of $H$. Now, $R_\lambda y$ and $BR_\lambda y \in D(A)$, and we have
in addition:

$$ABR_\lambda y = BAR_\lambda y. \tag{47}$$

But $(A - \lambda E)R_\lambda y = y$, whence
$AR_\lambda y = (\lambda R_\lambda + E)y$, and (47) can be rewritten as

$$ABR_\lambda y = \lambda BR_\lambda y + By; \quad \text{i.e.} \quad (A - \lambda E)BR_\lambda y = By. \tag{48}$$

On applying $R_\lambda$ to both sides of the last equation, we get
$BR_\lambda y = R_\lambda By$, and the necessity is proved.

Now let a regular $\lambda$ exist such that $BR_\lambda y = R_\lambda By$. The
form of the right-hand side implies that both sides belong to $D(A)$ for any
$y \in H$. If $y$ runs over the whole of $H$, $x = R_\lambda y$ runs over the
whole of $D(A)$, and the equation shows that, if $x \in D(A)$, then
$Bx \in D(A)$ also. On applying the operator $(A - \lambda E)$ to both sides,
we get (48) and (47), whilst (47) can be rewritten as $ABx = BAx$ for
$x \in D(A)$, and the sufficiency is proved.

COROLLARY. If $B$ commutes with $R_\lambda$ for any one regular $\lambda$, it
commutes with $R_\lambda$ for all regular $\lambda$.

We now turn to the definition of invariant subspace. 

DEFINITION. A subspace $L$ is said to be invariant under the operator $A$
if the following condition is satisfied: if $x \in D(A)$, and $x \in L$, then
$Ax \in L$ also.

If $L$ is a subspace invariant under $A$ and $D_L(A)$ is a lineal of elements
$x$ belonging simultaneously to $D(A)$ and $L$, $A$ induces into $L$ an
operator $A_L$, which is defined on $D_L(A)$ and is equal to $A$. The subspace
$L$ can be regarded as a Hilbert space (it may be finite-dimensional). As we
shall see below, it is essential that not only $L$, but the complementary
subspace $H \ominus L$ also be invariant under $A$, and that the projection of
any element $x \in D(A)$ on $L$ also belong to $D(A)$. This leads us to the
definition:

DEFINITION. A subspace $L$ is said to reduce an operator $A$ when the
following conditions are fulfilled: (1) $L$ and $M = H \ominus L$ are subspaces
invariant under $A$; (2) if $x \in D(A)$, the projection of $x$ into $L$
belongs to $D(A)$.

We shall in future write $P_K$ for the projector onto the subspace $K$.

We have

$$x = P_L x + P_M x. \tag{49}$$

If $x \in D(A)$, it follows from the definition that $P_L x \in D(A)$, and
hence $P_M x \in D(A)$, i.e. if $L$ reduces $A$, $M$ also reduces $A$. Let
$A_L$ and $A_M$ be the operators which $A$ induces into $L$ and $M$. We now
have, for any $x \in D(A)$,

$$Ax = A_L(P_L x) + A_M(P_M x), \tag{50}$$

and we have thus split $A$ into operators $A_L$ and $A_M$ acting in $L$ and
$M$.

THEOREM 2. The necessary and sufficient condition for the subspace $L$ to
reduce the operator $A$ is that $P_L$ and $A$ commute.

Let us first prove the necessity. Let $L$ reduce $A$. By the definition of
reducibility, if $x \in D(A)$, then $P_L x \in D(A)$. It remains to show that
$P_L Ax = AP_L x$ for $x \in D(A)$. We have (50), where $A_L(P_L x) \in L$ and
$A_M(P_M x) \in M$. On applying the operator $P_L$ to both sides of (50), we
obtain

$$P_L Ax = A_L(P_L x) = AP_L x,$$

which is what we set out to prove.

Sufficiency. Since $P_L$ and $A$ commute, $P_L x \in D(A)$ if $x \in D(A)$. It
remains to show that, if $x \in D(A)$ and $x \in L$, then $Ax \in L$ and
similarly for $M$. The first follows immediately from $P_L Ax = AP_L x$, the
left-hand side of which obviously belongs to $L$, whilst the right-hand side
can be written as $AP_L x = Ax$, since $x \in L$. The same is true for $M$,
since $A$ commutes with $P_M$ if $A$ commutes with $P_L$.
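
A finite-dimensional illustration of this criterion may be helpful. It is only a sketch: in three dimensions every operator is bounded and defined everywhere, so the domain condition is vacuous and only the commutation $P_L A = A P_L$ remains to be checked. The matrices and the helper `matmul` are hypothetical choices, not taken from the text.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    size = len(X)
    return [
        [sum(X[i][k] * Y[k][j] for k in range(size)) for j in range(size)]
        for i in range(size)
    ]


# Projector onto L = span(e1, e2); M is then span(e3).
P_L = [[1, 0, 0],
       [0, 1, 0],
       [0, 0, 0]]

# Symmetric matrix that is block-diagonal with respect to L and M.
A_reduced = [[2, 1, 0],
             [1, 3, 0],
             [0, 0, 5]]

# Symmetric matrix that couples e2 with e3, so L is not invariant.
A_coupled = [[2, 1, 0],
             [1, 3, 4],
             [0, 4, 5]]

commutes_reduced = matmul(P_L, A_reduced) == matmul(A_reduced, P_L)
commutes_coupled = matmul(P_L, A_coupled) == matmul(A_coupled, P_L)
print(commutes_reduced, commutes_coupled)
```

The block-diagonal matrix commutes with $P_L$, and $L$ reduces it; the coupled matrix does not commute with $P_L$, in agreement with the fact that it maps $e_2 \in L$ to an element with a component along $e_3$.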

We now turn to the case when A is self-conjugate. 

THEOREM 3. A sufficient condition for a subspace $L$, invariant under a
self-conjugate operator $A$, to reduce this operator is that $x \in D(A)$
implies $P_L x \in D(A)$.

We have to show that the conditions of the theorem imply that, if
$y \in D(A)$ and $y \in M$, then $Ay \in M$ also.

We have for such an element $y$, and any $x \in D(A)$:
$(P_L Ay, x) = (y, AP_L x)$. But $AP_L x \in L$ by hypothesis, whence
$(y, AP_L x) = 0$ and hence $(P_L Ay, x) = 0$ for any $x \in D(A)$. But the
lineal $D(A)$ is dense in $H$, whence $P_L Ay = 0$, i.e. $Ay \in M$; this is
what we wanted to prove.

Let $D_L(A)$ denote as above the projection of $D(A)$ into $L$, i.e. the
lineal of the elements of $L$ on which the operator $A$ is defined, and $A_L$
denote the operator which is induced by $A$ into $L$.

THEOREM 4. If a subspace $L$ reduces a self-conjugate operator $A$, $D_L(A)$
is dense in $L$ and $A_L$ is a self-conjugate operator in $L$.

Let $y$ be a given element of $L$ and $\varepsilon > 0$ a given number. We have
to show that there exists an element $x \in D_L(A)$ such that
$\|y - x\| < \varepsilon$. The lineal $D(A)$ is dense in $H$, so that there
exists an element $z \in D(A)$ such that $\|y - z\| < \varepsilon$. All the
more, $\|P_L y - P_L z\| < \varepsilon$. But $P_L y = y$ and
$P_L z \in D_L(A)$, and the first statement of the theorem is proved. It
remains to show that, if

$$(A_L x, y) = (x, y^*), \tag{51}$$

for any $x \in D_L(A)$, where $y$ and $y^* \in L$, then $y \in D_L(A)$ and
$y^* = A_L y$. We can put $x = P_L z$, where $z$ is any element of $D(A)$, and
we get $(AP_L z, y) = (P_L z, y^*)$ or, by Theorem 2,
$(P_L Az, y) = (P_L z, y^*)$, whence $(Az, P_L y) = (z, P_L y^*)$ and
$(Az, y) = (z, y^*)$, since $y$ and $y^* \in L$. Since $A$ is self-conjugate,
the last equation shows that $y \in D(A)$, so that $y \in D_L(A)$, and
$y^* = Ay = A_L y$, and the theorem is proved. We made use of this theorem in
[189].

Let $L_k$ ($k = 1, 2, \ldots$) be mutually orthogonal subspaces and $L$ their
orthogonal sum [139]:

$$L = L_1 \oplus L_2 \oplus \ldots$$

THEOREM 5. If the mutually orthogonal subspaces $L_k$ reduce a closed
operator $A$, their orthogonal sum also reduces $A$.

We shall prove this for the case of an infinite number of terms. We have to
show that the operators $P_L$ and $A$ commute. Let $Q_n$ be a projector, equal
to the sum of the first $n$ of the $P_{L_k}$. If $x \in D(A)$, since the $L_k$
reduce $A$, we have $Q_n x \in D(A)$ and $AQ_n x = Q_n Ax$. But
$Q_n x \Rightarrow P_L x$ and $AQ_n x = Q_n Ax \Rightarrow P_L Ax$, whence,
since $A$ is closed, $P_L x \in D(A)$ and $AP_L x = P_L Ax$, which is what we
wanted to prove.

Let $A$ be a self-conjugate operator, $\lambda_k$ be distinct eigenvalues of
it, and $L_k$ the corresponding subspaces of eigenelements (including the zero
element). The number of these subspaces may be finite. Each $L_k$ obviously
reduces $A$. We form their orthogonal sum $L$. If $L$ is the whole of $H$, $A$
has a point spectrum. If this is not the case, we have the orthogonal
decomposition of $H$:

$$H = L \oplus M \quad \text{and} \quad x = P_L x + P_M x \quad (x \in H), \tag{52}$$

where $L$ and $M$ reduce $A$, and this operator induces into $L$ and $M$ the
operators $A_L$ and $A_M$ such that $Ax = A_L(P_L x) + A_M(P_M x)$, where
$A_L(P_L x) = A(P_L x)$ and $A_M(P_M x) = A(P_M x)$ for $x \in D(A)$. The
operator $A_L$ has a point spectrum in $L$, whilst $A_M$ has a purely
continuous spectrum in $M$.


192. Resolutions of the identity. The Stieltjes integral. We now turn 
to the theory of spectral functions (resolutions of the identity) for 
self-conjugate operators. This is largely analogous to the case of a 
bounded self-conjugate operator. We shall emphasize the details where 
the unboundedness of the operator has to be taken into account. 

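We define a resolution of the identity as a family of projectors
$\mathcal{E}_\lambda$, depending on a real parameter $\lambda$ in the interval
$(-\infty, +\infty)$ and satisfying the following conditions: (1) if
$\mu > \lambda$, then $\mathcal{E}_\mu \ge \mathcal{E}_\lambda$; (2)
$\mathcal{E}_\lambda$ tends to the annihilation operator as
$\lambda \to -\infty$ and $\mathcal{E}_\lambda \to E$ as
$\lambda \to +\infty$; (3) $\mathcal{E}_\lambda$ is continuous from the right,
i.e. $\mathcal{E}_\lambda \to \mathcal{E}_{\lambda'}$ as
$\lambda \to \lambda' + 0$. Here,
$\mathcal{E}_\lambda \mathcal{E}_\mu = \mathcal{E}_\mu \mathcal{E}_\lambda = \mathcal{E}_\lambda$
for $\lambda < \mu$, and if $\Delta$ is some interval $[\alpha, \beta]$, we have
as before, on writing
$\Delta\mathcal{E}_\lambda = \mathcal{E}_\beta - \mathcal{E}_\alpha$:

$$\Delta'\mathcal{E}_\lambda x \perp \Delta''\mathcal{E}_\lambda x, \tag{53}$$

($\Delta'$ and $\Delta''$ have no common interior points)

$$\Delta'\mathcal{E}_\lambda \cdot \Delta''\mathcal{E}_\lambda = \Delta_0\mathcal{E}_\lambda \tag{54}$$

($\Delta_0$ is the common part of $\Delta'$ and $\Delta''$).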
Let $\delta$ be a subdivision of the interval $(-\infty, +\infty)$:

$$\ldots < \lambda_{-2} < \lambda_{-1} < \lambda_0 < \lambda_1 < \lambda_2 < \ldots,$$

where the upper bound $\mu_\delta$ of the differences
$\lambda_k - \lambda_{k-1}$ ($k = 0, \pm 1, \pm 2, \ldots$) is finite. We form
the infinite sum

$$\sum_{k=-\infty}^{+\infty} \lambda'_k\, \Delta_k\mathcal{E}_\lambda x = \sum_{k=-\infty}^{+\infty} \lambda'_k\, (\mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_{k-1}})\, x, \tag{55}$$

where $\lambda_{k-1} \le \lambda'_k \le \lambda_k$ and $x$ is an element of
$H$. By (53), this sum consists of mutually orthogonal elements and the
necessary and sufficient condition for its convergence is that the series
[121]

$$\sum_{k=-\infty}^{+\infty} \lambda_k'^2\, \|\Delta_k\mathcal{E}_\lambda x\|^2 = \sum_{k=-\infty}^{+\infty} \lambda_k'^2\, \Delta_k \|\mathcal{E}_\lambda x\|^2 \tag{56}$$

be convergent.

This series is the sum $\sigma_\delta$ [3] for the integral

$$\int_{-\infty}^{+\infty} \lambda^2\, d(\mathcal{E}_\lambda x, x) = \int_{-\infty}^{+\infty} \lambda^2\, d\|\mathcal{E}_\lambda x\|^2, \tag{57}$$

and we know from [5] that, if series (56) is convergent for some subdivision
$\delta$ and some choice of $\lambda'_k$, it is convergent for every
subdivision and every choice of $\lambda'_k$. The limit of sum (56) as
$\mu_\delta \to 0$ is equal to integral (57) and the existence of (57) as an
improper integral is equivalent to the convergence of sum (56). We are
therefore justified in considering sums (55) for the elements $x$ for which
series (56) is convergent, or what amounts to the same thing, for which
integral (57) has a finite value. Let $l$ denote the set of such $x$. On
observing that

$$\|\Delta\mathcal{E}_\lambda (x + y)\|^2 \le (\|\Delta\mathcal{E}_\lambda x\| + \|\Delta\mathcal{E}_\lambda y\|)^2 \le 2\|\Delta\mathcal{E}_\lambda x\|^2 + 2\|\Delta\mathcal{E}_\lambda y\|^2,$$

we can say that, if $x \in l$ and $y \in l$, then $x + y \in l$. In addition,
it is obvious that, if $x \in l$ and $a$ is a complex number, then
$ax \in l$, i.e. $l$ is a lineal. If $x$ belongs to the subspace onto which
the operator $\mathcal{E}_\beta - \mathcal{E}_\alpha$ projects, the terms of
sum (56) for which $\lambda_{k-1} \ge \beta$ or $\lambda_k \le \alpha$ are
zero, i.e. such $x$ belong to $l$. On observing that
$\mathcal{E}_\beta - \mathcal{E}_\alpha \to E$ as $\alpha \to -\infty$ and
$\beta \to +\infty$, we can say that the lineal $l$ is everywhere dense in
$H$. Further, if $x \in l$, we can show, precisely as in [141], that sums (55)
have a definite limit in the sense of a convergence in $H$ as
$\mu_\delta \to 0$. This limit is naturally written as a Stieltjes integral,
and it defines on the lineal $l$ a distributive operator $Ax$:

$$Ax = \int_{-\infty}^{+\infty} \lambda\, d\mathcal{E}_\lambda x. \tag{58}$$

Let us write $D(A)$ as usual for the lineal $l$. We recall that it consists of
the elements $x$ for which integral (57) has a finite value. On forming the
scalar product of sum (55) with itself, using (54) and passing to the limit,
we obtain

$$\int_{-\infty}^{+\infty} \lambda^2\, d\|\mathcal{E}_\lambda x\|^2 = \|Ax\|^2 \quad (x \in D(A)), \tag{59}$$

where the integral is the limit of sums (56) as $\mu_\delta \to 0$, or it can
be understood as an improper integral with infinite limits. On forming the
scalar product of sum (55) with any element $y$ and passing to the limit, we
get the expression for a bilinear functional:

$$(Ax, y) = \int_{-\infty}^{+\infty} \lambda\, d(\mathcal{E}_\lambda x, y) \quad (x \in D(A);\ y \in H), \tag{60}$$

and the integral here is the limit of the corresponding sums $\sigma_\delta$
as $\mu_\delta \to 0$. If we replace $y$ by
$(\mathcal{E}_\beta - \mathcal{E}_\alpha)y$, we obtain, by (54):

$$(Ax, (\mathcal{E}_\beta - \mathcal{E}_\alpha)y) = \int_\alpha^\beta \lambda\, d(\mathcal{E}_\lambda x, y);$$

and, on passing to the limit as $\alpha \to -\infty$ and $\beta \to +\infty$,
we have

$$(Ax, y) = \lim \int_\alpha^\beta \lambda\, d(\mathcal{E}_\lambda x, y), \tag{61}$$

i.e. integral (60) can be understood as an ordinary improper Stieltjes
integral, where $(\mathcal{E}_\lambda x, y)$ is a function of bounded
variation.

Notice that the infinite interval $(-\infty, +\infty)$ has a finite measure
with respect to the non-decreasing function $\|\mathcal{E}_\lambda x\|^2$, and
we can interpret (57) as the Lebesgue-Stieltjes integral of an unbounded
non-negative function $\lambda^2$ over a set of finite measure [50].

Let $x$ be any element of $H$. Now,
$(\mathcal{E}_\beta - \mathcal{E}_\alpha)x$ belongs to the subspace onto which
$(\mathcal{E}_\beta - \mathcal{E}_\alpha)$ projects, and it follows from this,
as we saw above, that $(\mathcal{E}_\beta - \mathcal{E}_\alpha)x \in D(A)$ for
any choice of element $x$. This assertion does not hold for the element
$\mathcal{E}_\mu x$. But, if $x \in D(A)$, i.e. series (56) is convergent, the
fact that
$\|\Delta_k\mathcal{E}_\lambda(\mathcal{E}_\mu x)\| = \|\mathcal{E}_\mu \Delta_k\mathcal{E}_\lambda x\| \le \|\Delta_k\mathcal{E}_\lambda x\|$
implies that series (56) is convergent when $x$ is replaced by
$\mathcal{E}_\mu x$, i.e. if $x \in D(A)$, then $\mathcal{E}_\mu x \in D(A)$
for any $\mu$. If $x$ is taken to belong to $D(A)$, we can replace $x$ by
$\mathcal{E}_\mu x$ in sum (55), taking $\mu$ as one of the points of
subdivision ($\mu = \lambda_p$). All the terms with $k > p$ now vanish, whilst
the terms with $k \le p$ remain invariable, and we obtain in the limit

$$A\mathcal{E}_\mu x = \int_{-\infty}^{\mu} \lambda\, d\mathcal{E}_\lambda x. \tag{62}$$

This integral is the limit of a sum of form (55) when the interval
$(-\infty, \mu)$ is subdivided. On the other hand, if we apply to the sum (55)
the operator $\mathcal{E}_\mu$, which is bounded and therefore continuous, what
has been said above about the terms remains in force, and, in view of the
continuity of $\mathcal{E}_\mu$, we get in the limit as $\mu_\delta \to 0$:

$$\mathcal{E}_\mu Ax = \int_{-\infty}^{\mu} \lambda\, d\mathcal{E}_\lambda x \quad (x \in D(A)); \tag{62$_1$}$$

on comparing with (62), we can write

$$\mathcal{E}_\mu Ax = A\mathcal{E}_\mu x \quad (x \in D(A)). \tag{63}$$

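Similarly, we have for any $x$:

$$A(\mathcal{E}_\beta - \mathcal{E}_\alpha)x = \int_\alpha^\beta \lambda\, d\mathcal{E}_\lambda x \quad (x \in H). \tag{64}$$

If $x \in D(A)$, it follows also from (62) that

$$(\mathcal{E}_\beta - \mathcal{E}_\alpha)Ax = \int_\alpha^\beta \lambda\, d\mathcal{E}_\lambda x \quad (x \in D(A)), \tag{64$_1$}$$

and we obtain, on letting $\alpha \to -\infty$ and $\beta \to +\infty$:

$$\int_\alpha^\beta \lambda\, d\mathcal{E}_\lambda x \to Ax, \tag{65}$$

i.e. integral (58), like (57), can be interpreted as an improper integral.
It follows at once from the above formulae that

$$(A\mathcal{E}_\mu x, y) = (\mathcal{E}_\mu Ax, y) = \int_{-\infty}^{\mu} \lambda\, d(\mathcal{E}_\lambda x, y) \quad (x \in D(A);\ y \in H); \tag{66$_1$}$$

$$(A(\mathcal{E}_\beta - \mathcal{E}_\alpha)x, y) = \int_\alpha^\beta \lambda\, d(\mathcal{E}_\lambda x, y) \quad (x \in H;\ y \in H). \tag{66}$$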


If $y$ as well as $x$ belongs to $D(A)$, we obtain on substituting $y$ instead
of $x$ in (58) and forming the scalar product from the left with $x$:

$$(x, Ay) = \int_{-\infty}^{+\infty} \lambda\, d(x, \mathcal{E}_\lambda y).$$

On comparing with (60) and noting that
$(x, \mathcal{E}_\lambda y) = (\mathcal{E}_\lambda x, y)$, we get
$(Ax, y) = (x, Ay)$, i.e. $A$ is a symmetric operator. Let us now show that $A$
is self-conjugate. It is enough to show that, if $z \in D(A^*)$, then
$z \in D(A)$. Let $\delta$ be a subdivision of $(-\infty, +\infty)$ and $P_j$
the subspace defined by the projector
$\mathcal{E}_{\lambda_j} - \mathcal{E}_{\lambda_{-j}}$. If $x \in P_j$, all the
terms in sums (55) and (56) vanish for $k \ge j + 1$ and $k \le -j$, the
element $x \in D(A)$ and, when $j + 1 > k > -j$, the subspaces defined by the
projectors $\Delta_k\mathcal{E}_\lambda$ belong to $P_j$, so that sum (55) and
its limit $Ax$ belong to $P_j$. Hence it follows that $Ax \in D(A)$ and that
$A^2 x \in P_j$. Suppose that $z \in D(A^*)$; let us show that $z \in D(A)$.
Let $z_j$ be the projection of $z$ onto $P_j$, so that $z_j \in D(A)$, i.e.
$z_j \in D(A^*)$. The element $(z - z_j)$ also belongs to $D(A^*)$ and is
orthogonal to $P_j$. By definition of $A^*$,
$(A^2 z_j, z - z_j) = (Az_j, A^*(z - z_j))$. But $A^2 z_j \in P_j$ and
$z - z_j$ is orthogonal to $P_j$, so that

$$(Az_j, A^*(z - z_j)) = 0.$$

On taking into account the obvious equation

$$(A^*z, A^*z) = (A^*(z - z_j), A^*(z - z_j)) + (A^*z_j, A^*z_j) + (A^*(z - z_j), A^*z_j) + (A^*z_j, A^*(z - z_j)),$$

the previous formula and $A^*z_j = Az_j$, we obtain

$$\|A^*z\|^2 = \|A^*(z - z_j)\|^2 + \|Az_j\|^2, \tag{67}$$

whence

$$\|Az_j\|^2 \le \|A^*z\|^2.$$

Let us consider sum (56) with $x = z_j$. All its terms vanish for
$k \ge j + 1$ and $k \le -j$, whilst for the remaining terms, by (54):

$$\Delta_k\mathcal{E}_\lambda z_j = (\mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_{k-1}})(\mathcal{E}_{\lambda_j} - \mathcal{E}_{\lambda_{-j}})z = (\mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_{k-1}})z = \Delta_k\mathcal{E}_\lambda z.$$

Thus (56) gives in the limit, in view of (59):

$$\|Az_j\|^2 = \int_{\lambda_{-j}}^{\lambda_j} \lambda^2\, d\|\mathcal{E}_\lambda z\|^2,$$

and, by (67),

$$\int_{\lambda_{-j}}^{\lambda_j} \lambda^2\, d\|\mathcal{E}_\lambda z\|^2 \le \|A^*z\|^2.$$
If we let $j$ increase indefinitely, it will be seen that integral (57) has
a finite value for $x = z$, i.e. $z \in D(A)$, so that $A$ is a self-conjugate
operator.

The above discussion leads to the following theorem:

THEOREM 1. For every resolution of the identity $\mathcal{E}_\lambda$ there is
a corresponding self-conjugate operator $A$, defined for all the elements $x$
for which integral (57) has finite values. The operator $A$ is defined as the
limit of sums (55), or what amounts to the same thing, as integral (58).
The corresponding bilinear functional is defined by (60).
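
The construction of $A$ from sums (55) can be followed on a small example. The sketch below assumes a hypothetical operator with five eigenvalues, for which $\mathcal{E}_\lambda$ keeps the eigen-components with eigenvalue $\le \lambda$; the uniform subdivision, the choice of $\lambda'_k$ as midpoints, and all names are illustrative assumptions rather than anything prescribed by the text.

```python
import math


def spectral_projection(coeffs, eigenvalues, lam):
    """Coefficients of E_lam x: keep the components whose eigenvalue is <= lam."""
    return [c if ev <= lam else 0.0 for c, ev in zip(coeffs, eigenvalues)]


def stieltjes_sum(coeffs, eigenvalues, mesh, lo, hi):
    """Sum (55) on the uniform subdivision lo, lo + mesh, ..., with midpoints as lambda'_k."""
    total = [0.0] * len(coeffs)
    steps = int(round((hi - lo) / mesh))
    for j in range(1, steps + 1):
        left, right = lo + (j - 1) * mesh, lo + j * mesh
        e_right = spectral_projection(coeffs, eigenvalues, right)
        e_left = spectral_projection(coeffs, eigenvalues, left)
        mid = 0.5 * (left + right)
        total = [t + mid * (r - l) for t, r, l in zip(total, e_right, e_left)]
    return total


def norm(coeffs):
    return math.sqrt(sum(c * c for c in coeffs))


eigenvalues = [-1.3, 0.2, 0.7, 2.9, 4.05]   # hypothetical spectrum
x = [1.0, -2.0, 0.5, 3.0, 1.5]              # coefficients of x in the eigenbasis
Ax = [ev * c for ev, c in zip(eigenvalues, x)]

errors = {}
for mesh in (1.0, 0.1, 0.01):
    approx = stieltjes_sum(x, eigenvalues, mesh, lo=-2.0, hi=5.0)
    errors[mesh] = norm([p - q for p, q in zip(approx, Ax)])
    print(mesh, errors[mesh])
```

Each eigen-component is multiplied by a value $\lambda'_k$ lying within one mesh width of its eigenvalue, so the distance between the sum and $Ax$ does not exceed the mesh width times $\|x\|$, and the sums converge to $Ax$ as the subdivision is refined.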

The converse can be proved.

THEOREM 2. Given any self-conjugate operator $A$, there exists a resolution
of the identity $\mathcal{E}_\lambda$ such that $A$ is expressed by (58).

The proof of this theorem will be given later. We shall prove below a formula
that gives $\mathcal{E}_\lambda$ in terms of $A$ [cf. 144], where distinct $A$
correspond to distinct $\mathcal{E}_\lambda$. The operator
$\mathcal{E}_\lambda$ is called the spectral function of the self-conjugate
operator $A$. It can be shown, precisely as in [144], that regular points
$\lambda = \mu$ are characterized by the fact that an interval exists, having
$\mu$ as an interior point, in which $\mathcal{E}_\lambda$ is constant. We
have to bear in mind here that
$(\mathcal{E}_\beta - \mathcal{E}_\alpha)x \in D(A)$ for any $x \in H$ and any
finite $\alpha$ and $\beta$. The eigenvalues $\lambda = \nu$ are characterized
by the fact that $\mathcal{E}_\lambda$ has a jump at $\lambda = \nu$, the
difference $\mathcal{E}_\nu - \mathcal{E}_{\nu - 0}$ being a projector onto the
subspace of corresponding eigenelements (including the zero element) [145].

Let the self-conjugate operator $A$ be semi-bounded, and let $m_A$ be its
greatest lower bound:

$$m_A = \inf\, (Ax, x) \quad \text{for } x \in D(A) \text{ and } \|x\| = 1.$$

We have, given any real $\lambda$,

$$((A - \lambda E)x, x) \ge (m_A - \lambda)(x, x) \quad (x \in D(A)),$$

whence

$$\|(A - \lambda E)x\| \cdot \|x\| \ge (m_A - \lambda)\, \|x\|^2$$

and

$$\|(A - \lambda E)x\| \ge (m_A - \lambda)\, \|x\|.$$

It follows from this that all the $\lambda$ satisfying the condition
$\lambda < m_A$ are regular points of $A$. Let us show that $\lambda = m_A$ is
a point of the spectrum of $A$. If this were not the case, there would exist
an $m_1 > m_A$ such that all the values $\lambda < m_1$ are regular points of
$A$, and $\mathcal{E}_\lambda$ is the annihilation operator for
$\lambda < m_1$, so that

$$(Ax, x) = \int_{m_1}^{\infty} \lambda\, d(\mathcal{E}_\lambda x, x) \quad (x \in D(A)),$$

whence it follows that $(Ax, x) \ge m_1 (x, x)$ for $x \in D(A)$, and this
contradicts the definition of $m_A$. The value $\lambda = m_A$ is also known
as the lower bound of the spectrum of $A$.

The operator @, is called the spectral function of the self-conjugate 
operator A. We shall now re-iterate briefly the properties of general 
self-conjugate operators, which are precisely similar to those of bounded 
self-conjugate operators. 
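The characterization of $m_A$ as the infimum of $(Ax, x)$ over unit vectors, and as the lowest point of the spectrum, can be checked in the simplest finite-dimensional case. The following is only an illustrative sketch: the 2x2 symmetric matrix and the sampling grid are assumptions chosen for the example, not part of the text.

```python
import math

def eigenvalues_sym2(a, b, d):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, d]], in increasing order."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return mean - radius, mean + radius

def quadratic_form(a, b, d, theta):
    """(Ax, x) for the unit vector x = (cos(theta), sin(theta))."""
    c, s = math.cos(theta), math.sin(theta)
    return a * c * c + 2.0 * b * c * s + d * s * s

# Hypothetical matrix entries, chosen only for illustration.
a, b, d = 2.0, 1.0, 3.0
lam_min, lam_max = eigenvalues_sym2(a, b, d)

# Sample (Ax, x) over unit vectors; theta in [0, pi) covers every direction up to sign.
samples = [quadratic_form(a, b, d, math.pi * k / 20000) for k in range(20000)]
m_A = min(samples)  # numerical estimate of inf (Ax, x) over ||x|| = 1
```

In this finite case the sampled infimum agrees with the smallest eigenvalue up to the grid resolution, which is the finite-dimensional counterpart of the statement that $m_A$ is the lower bound of the spectrum.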


193. Continuous functions of a self-conjugate operator. Let $f(\lambda)$ be a bounded function, uniformly continuous in $(-\infty, +\infty)$ (say $f(\lambda)$ is continuous in the closed interval $[-\infty, +\infty]$). We form the sum, analogous to (55):

$$\sum_{k=-\infty}^{+\infty} f(\lambda_k)\, \Delta_k \mathcal{E}_\lambda x, \tag{68}$$

where x is any element of H. It may easily be seen that this series, consisting of mutually orthogonal elements, is convergent for any x. For, $|f(\lambda)| \leq L$ by hypothesis, where L is a definite number, and

$$\Big\| \sum_{k=-\infty}^{+\infty} f(\lambda_k)\, \Delta_k \mathcal{E}_\lambda x \Big\|^2 = \sum_{k=-\infty}^{+\infty} |f(\lambda_k)|^2 \, \|\Delta_k \mathcal{E}_\lambda x\|^2 \leq L^2 \sum_{k=-\infty}^{+\infty} \|\Delta_k \mathcal{E}_\lambda x\|^2 = L^2 \|x\|^2, \tag{69}$$

and the series analogous to (56) is therefore convergent. It can be shown, precisely as in [141], that, given any x of H, sum (68) has a definite limit as the maximum length of the intervals $\Delta_k$ tends to zero. This limit yields a distributive operator $f(A)$, defined throughout H. It follows at once from (69) that $\|f(A)x\| \leq L\|x\|$, i.e. $f(A)$ is a bounded operator. It is natural to write the limit of sum (68) as a Stieltjes integral:

$$f(A)x = \int_{-\infty}^{+\infty} f(\lambda)\, d\mathcal{E}_\lambda x \qquad (x \in H), \tag{70}$$

$$(f(A)x, y) = \int_{-\infty}^{+\infty} f(\lambda)\, d(\mathcal{E}_\lambda x, y) \qquad (x \in H;\ y \in H). \tag{71}$$

The latter integral can be interpreted as an ordinary Stieltjes integral, such as we defined in [4].

We have the exact analogue of (62) and (63):

$$\mathcal{E}_\beta f(A)x = f(A)\mathcal{E}_\beta x = \int_{-\infty}^{\beta} f(\lambda)\, d\mathcal{E}_\lambda x, \tag{72}$$

$$(\mathcal{E}_\beta f(A)x, y) = (f(A)\mathcal{E}_\beta x, y) = \int_{-\infty}^{\beta} f(\lambda)\, d(\mathcal{E}_\lambda x, y) \qquad (x \in H;\ y \in H). \tag{73}$$

We could have applied the above definition of $f(A)$ to the case of any bounded function, continuous in $(-\infty, +\infty)$, without requiring its uniform continuity. We shall show later that the concept of function of a self-conjugate operator can be extended to a wider class of functions $f(\lambda)$.
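When the spectrum consists of finitely many eigenvalues, the Stieltjes integral (70) reduces to the finite sum $f(A) = \sum_k f(\lambda_k) P_k$ over the spectral projectors. The sketch below evaluates this sum for a 2x2 real symmetric matrix; the matrix, the helper names `spectral_data` and `apply_function`, and the two test functions are assumptions made for illustration only.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def spectral_data(A):
    """Eigenvalues and spectral projectors of a real symmetric 2x2 matrix
    with distinct eigenvalues (an assumption of this sketch)."""
    a, b, d = A[0][0], A[0][1], A[1][1]
    mean, radius = (a + d) / 2.0, math.hypot((a - d) / 2.0, b)
    lam1, lam2 = mean - radius, mean + radius
    gap = lam2 - lam1
    E = [[1.0, 0.0], [0.0, 1.0]]
    # From E = P1 + P2 and A = lam1*P1 + lam2*P2.
    P1 = [[(lam2 * E[i][j] - A[i][j]) / gap for j in range(2)] for i in range(2)]
    P2 = [[(A[i][j] - lam1 * E[i][j]) / gap for j in range(2)] for i in range(2)]
    return [(lam1, P1), (lam2, P2)]

def apply_function(f, A):
    """f(A) = sum_k f(lambda_k) P_k, the finite analogue of integral (70)."""
    result = [[0.0, 0.0], [0.0, 0.0]]
    for lam, P in spectral_data(A):
        for i in range(2):
            for j in range(2):
                result[i][j] += f(lam) * P[i][j]
    return result

A = [[2.0, 1.0], [1.0, 3.0]]                               # hypothetical operator
A_squared = apply_function(lambda t: t * t, A)             # should equal A*A
bounded = apply_function(lambda t: 1.0 / (1.0 + t * t), A) # should invert E + A^2
```

For $f(\lambda) = \lambda^2$ the spectral sum reproduces the ordinary matrix square, and for the bounded function $1/(1 + \lambda^2)$ it gives the inverse of $E + A^2$.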

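We have the obvious formulae for an operator $(A - lE)$, where $l$ is any given number:

$$(A - lE)x = \int_{-\infty}^{+\infty} (\lambda - l)\, d\mathcal{E}_\lambda x; \qquad ((A - lE)x, y) = \int_{-\infty}^{+\infty} (\lambda - l)\, d(\mathcal{E}_\lambda x, y) \tag{74}$$

$$(x \in D(A);\ y \in H).$$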


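194. The resolvent. Let us find an expression for the resolvent in terms of the spectral function.

If $l$ is not real, $1/(\lambda - l)$ is a function of $\lambda$, continuous in the closed interval $[-\infty, +\infty]$, and we can form the bounded operator

$$R_l x = \int_{-\infty}^{+\infty} \frac{1}{\lambda - l}\, d\mathcal{E}_\lambda x. \tag{75}$$

Let us show that this has all the properties of the resolvent when $\lambda = l$, which will justify us in writing it as $R_l$. Given any x, the element $(\mathcal{E}_\beta - \mathcal{E}_\alpha) R_l x \in D(A)$, and we have by $(66_1)$:

$$((A - lE)(\mathcal{E}_\beta - \mathcal{E}_\alpha) R_l x, y) = \int_{\alpha}^{\beta} (\lambda - l)\, d(\mathcal{E}_\lambda R_l x, y).$$

On the other hand, by (73) and (75):

$$(\mathcal{E}_\lambda R_l x, y) = \int_{-\infty}^{\lambda} \frac{1}{\mu - l}\, d(\mathcal{E}_\mu x, y).$$

On substituting in the previous formula and using the properties of the Stieltjes integral [9], we obtain

$$((A - lE)(\mathcal{E}_\beta - \mathcal{E}_\alpha) R_l x, y) = \int_{\alpha}^{\beta} d(\mathcal{E}_\lambda x, y) = ((\mathcal{E}_\beta - \mathcal{E}_\alpha)x, y),$$

whence, since y is arbitrary,

$$(A - lE)(\mathcal{E}_\beta - \mathcal{E}_\alpha) R_l x = (\mathcal{E}_\beta - \mathcal{E}_\alpha)x. \tag{76}$$

Let us form the two number sequences $\alpha_n$ and $\beta_n$, where $\alpha_n \to -\infty$ and $\beta_n \to +\infty$, and the element sequence $y_n = (\mathcal{E}_{\beta_n} - \mathcal{E}_{\alpha_n}) R_l x$. We have $y_n \in D(A)$, $y_n \Rightarrow R_l x$, and, by (76), $(A - lE)y_n \Rightarrow x$. Hence it follows, since A is closed, that $R_l x \in D(A)$ for any x and $(A - lE) R_l x = x$.

It remains to show that $R_l (A - lE)x = x$ for $x \in D(A)$. This follows at once from the formulae

$$(R_l (A - lE)x, y) = \int_{-\infty}^{+\infty} \frac{1}{\lambda - l}\, d(\mathcal{E}_\lambda (A - lE)x, y)$$

and

$$(\mathcal{E}_\lambda (A - lE)x, y) = \int_{-\infty}^{\lambda} (\mu - l)\, d(\mathcal{E}_\mu x, y),$$

which are consequences of (71) and $(66_1)$.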
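The formula defining the spectral function in terms of the resolvent remains in force:

$$\frac{1}{2}\left[(\mathcal{E}_\lambda x, y) + (\mathcal{E}_{\lambda - 0} x, y)\right] = \lim_{\tau \to +0} \frac{1}{2\pi i} \int_{-\infty}^{\lambda} ((R_{\sigma + i\tau} - R_{\sigma - i\tau})x, y)\, d\sigma. \tag{77}$$

A definite spectral function $\mathcal{E}_\lambda$ corresponds to a self-conjugate operator, and the operator is bounded when and only when $\mathcal{E}_\lambda$ is variable only on a finite interval.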
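In the finite-dimensional case integral (75) becomes the sum $R_l = \sum_k P_k / (\lambda_k - l)$, and the two resolvent identities $(A - lE)R_l = E$ and $R_l(A - lE) = E$ can be verified directly. The sketch below does this for a 2x2 real symmetric matrix and a non-real $l$; the matrix, the value of $l$ and the helper names are hypothetical choices for the example.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices (entries may be complex)."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def spectral_data(A):
    """Eigenvalues and spectral projectors of a real symmetric 2x2 matrix
    with distinct eigenvalues (an assumption of this sketch)."""
    a, b, d = A[0][0], A[0][1], A[1][1]
    mean, radius = (a + d) / 2.0, math.hypot((a - d) / 2.0, b)
    lam1, lam2 = mean - radius, mean + radius
    gap = lam2 - lam1
    E = [[1.0, 0.0], [0.0, 1.0]]
    P1 = [[(lam2 * E[i][j] - A[i][j]) / gap for j in range(2)] for i in range(2)]
    P2 = [[(A[i][j] - lam1 * E[i][j]) / gap for j in range(2)] for i in range(2)]
    return [(lam1, P1), (lam2, P2)]

def resolvent(A, l):
    """R_l = sum_k P_k / (lambda_k - l), the finite analogue of integral (75)."""
    R = [[0j, 0j], [0j, 0j]]
    for lam, P in spectral_data(A):
        for i in range(2):
            for j in range(2):
                R[i][j] += P[i][j] / (lam - l)
    return R

A = [[2.0, 1.0], [1.0, 3.0]]   # hypothetical self-conjugate operator
l = 1.0 + 2.0j                 # non-real, hence a regular point
R = resolvent(A, l)
A_minus_lE = [[A[i][j] - (l if i == j else 0.0) for j in range(2)] for i in range(2)]
left = matmul(A_minus_lE, R)   # expected: identity
right = matmul(R, A_minus_lE)  # expected: identity
```

Both products come out as the identity matrix up to rounding, which mirrors the two steps of the proof above.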

We have shown [191] that the necessary and sufficient condition for a bounded operator B, defined everywhere, to commute with a self-conjugate operator A is that

$$B R_l = R_l B \tag{78}$$

for any $l$ for which the resolvent exists. Let us now prove the following theorem:

THEOREM. The necessary and sufficient condition for B to commute with A is that, given any real $\lambda$, we have

$$B \mathcal{E}_\lambda = \mathcal{E}_\lambda B. \tag{79}$$

It is sufficient to show that conditions (78) and (79) are equivalent. By (75), we have for any elements x and y:

$$(B R_l x, y) = (R_l x, B^* y) = \int_{-\infty}^{+\infty} \frac{1}{\lambda - l}\, d(\mathcal{E}_\lambda x, B^* y), \qquad (R_l B x, y) = \int_{-\infty}^{+\infty} \frac{1}{\lambda - l}\, d(\mathcal{E}_\lambda B x, y). \tag{80}$$

If condition (79) is fulfilled, the right-hand, and therefore the left-hand sides of equations (80) are the same, and, since x and y are arbitrary, condition (78) is fulfilled. Conversely, if (78) is fulfilled, $(B\mathcal{E}_\lambda x, y)$ and $(\mathcal{E}_\lambda Bx, y)$ can only differ by a constant or at points of discontinuity, since the inversion of the Cauchy-Stieltjes integral [29] is unique. But both the functions tend to zero as $\lambda \to -\infty$ and are continuous from the right at points of discontinuity, so that we have $(B\mathcal{E}_\lambda x, y) = (\mathcal{E}_\lambda Bx, y)$ for any x and y, i.e. condition (79) is fulfilled, and the theorem is proved.

Notice that, by virtue of the results of [156], $\mathcal{E}_\mu$ commutes with A for any $\mu$. This also follows from the last theorem. It also follows from this theorem, together with (70), that $f(A)$ commutes with A.


195. Eigenvalues. As already mentioned, $\lambda = \lambda'$ is an eigenvalue of A when and only when $\mathcal{E}_\lambda$ has $\lambda'$ as a point of discontinuity, and here $\mathcal{E}_{\lambda'} - \mathcal{E}_{\lambda' - 0}$ is the projector into the subspace of corresponding eigenelements (including the zero element). On assuming H separable as usual, we can say that the number of eigenvalues, if there are such, is finite or denumerable. The rank of an eigenvalue is defined as before, and it can be assumed that the set of all eigenelements forms an orthonormal system $x_1, x_2, \ldots$ Let $\lambda_k$ be the points of discontinuity of $\mathcal{E}_\lambda$, $L_k$ the subspaces of corresponding eigenelements, and $P_k = \mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_k - 0}$ the projectors into these subspaces.

We form the orthogonal sum

$$H' = \sum_k \oplus L_k. \tag{81}$$

The subspace $H'$ reduces A, and the operator $A'$, induced by A in $H'$, is self-conjugate and has a purely point spectrum.

If the operator already has a purely point spectrum in H, then

$$\mathcal{E}_\lambda = \sum_{\lambda_k \leq \lambda} (\mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_k - 0}). \tag{82}$$

Let $x_1, x_2, \ldots$ be any complete system, orthonormal in H, and $\mu_1, \mu_2, \ldots$ a sequence of real numbers, some of which may be the same, $\mu_k$ and $x_k$ being described as corresponding. Further, let $\lambda_1, \lambda_2, \ldots$ be the different $\mu_s$, $L_k$ the subspace formed by the $x_s$ which correspond to the $\mu_s$ equal to $\lambda_k$, and $P_k$ the projector onto $L_k$.

We define the projector

$$\mathcal{E}_\lambda = \sum_{\lambda_k \leq \lambda} P_k.$$

This is a resolution of the identity, which corresponds to a self-conjugate operator C with purely point spectrum. Its eigenvalues are $\lambda_k$ and $x_1, x_2, \ldots$ is the complete set of eigenelements. If all the $\lambda_k$ belong to a finite interval, C is a bounded operator.
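The construction of $\mathcal{E}_\lambda = \sum_{\lambda_k \leq \lambda} P_k$ from an orthonormal system and a list of real numbers can be sketched in three dimensions. The orthonormal system, the numbers $\mu_k$ and the function names below are assumptions chosen for illustration; the value 2 is repeated so that the corresponding eigenvalue has rank two.

```python
import math

def outer(u):
    """Projector onto the line spanned by the unit vector u."""
    return [[ui * uj for uj in u] for ui in u]

def add(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(3)] for i in range(3)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def trace(X):
    return sum(X[i][i] for i in range(3))

s = 1.0 / math.sqrt(2.0)
system = [(s, s, 0.0), (s, -s, 0.0), (0.0, 0.0, 1.0)]  # hypothetical orthonormal system
mu = [2.0, -1.0, 2.0]                                   # corresponding real numbers

def spectral_function(lam):
    """E_lambda = sum of the projectors onto the x_k whose mu_k does not exceed lambda."""
    E = [[0.0] * 3 for _ in range(3)]
    for x, m in zip(system, mu):
        if m <= lam:
            E = add(E, outer(x))
    return E

# The trace of a projector equals the dimension of its range.
dims = [round(trace(spectral_function(lam))) for lam in (-2.0, -1.0, 1.9, 2.0, 5.0)]
E0 = spectral_function(0.0)
jump_at_2 = round(trace(spectral_function(2.0)) - trace(spectral_function(1.9)))
```

The dimensions of the ranges are non-decreasing in $\lambda$, start at zero and end at the dimension of the whole space, and the jump at the repeated value has rank two.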


196. The case of a mixed spectrum. First, some remarks supplementary to what we said in [191] about the decomposition of a self-conjugate operator A into operators with a purely point and a purely continuous spectrum. Let some subspace $H'$ reduce A. It now reduces $\mathcal{E}_\lambda$, and, on writing $A'$ and $\mathcal{E}'_\lambda$ for the operators induced by A and $\mathcal{E}_\lambda$ in $H'$, we can say that $\mathcal{E}'_\lambda$ is a resolution of the identity in $H'$ and

$$A'x = \int_{-\infty}^{+\infty} \lambda\, d\mathcal{E}'_\lambda x \qquad (x \in D(A')), \tag{83}$$

i.e. $\mathcal{E}'_\lambda$ is the spectral function of $A'$. Let $A''$ and $\mathcal{E}''_\lambda$ be the operators induced by A and $\mathcal{E}_\lambda$ in the subspace $H'' = H \ominus H'$. If $x = x' + x''$ and $y = y' + y''$ are decompositions of x and y onto $H'$ and $H''$, where $x \in D(A)$, we have [191]:

$$Ax = A'x' + A''x''; \qquad \mathcal{E}_\lambda y = \mathcal{E}'_\lambda y' + \mathcal{E}''_\lambda y'';$$

$$(Ax, y) = (A'x', y') + (A''x'', y''). \tag{84}$$

Similar formulae hold in the case of a finite or infinite number of mutually orthogonal subspaces reducing A. We now return to the notation of [191] and suppose that $H'$ is not the whole of H. The operator $A'$ has a purely point spectrum in $H'$, whilst $A''$ has a purely continuous spectrum in $H''$. Now,

$$\mathcal{E}'_\lambda = \sum_{\lambda_k \leq \lambda} (\mathcal{E}_{\lambda_k} - \mathcal{E}_{\lambda_k - 0}),$$

whilst the spectral function $\mathcal{E}''_\lambda$ of $A''$, expressed by $\mathcal{E}''_\lambda = \mathcal{E}_\lambda - \mathcal{E}'_\lambda$, has no discontinuities. If A is an unbounded operator, one of the operators $A'$ or $A''$ may be bounded. For instance, if all the points of discontinuity of $\mathcal{E}_\lambda$ lie in a finite interval, $A'$ is a bounded operator. Suppose that A has a purely continuous spectrum, and let $C_x$ denote, as in [147], the closed linear envelope of elements $\mathcal{E}_\lambda x$. We say that A has a simple continuous spectrum if there exists an element x such that $C_x$ coincides with H. Here, (254) and (256) of [147] will hold, with Hellinger integrals over an infinite interval. If $y \in D(A)$, (259) and (261) of [147] will also hold. We can regard the corresponding integrals as improper with an infinite interval of integration. On using the inequality [147]

| Ay, (a) |? < de (A) 4 || Fy! 


2 
F) 


we can show, as in [192] as regards sums (55), that the infinite sums of mutually orthogonal elements, corresponding to integral (261) of [147], yield a convergent series by virtue of the fact that $y \in D(A)$. In the general case of a continuous spectrum, it follows from the proof of Theorem 2 of [147] that $C_x$ reduces $\mathcal{E}_\lambda$ for any $\lambda$, i.e. it also reduces A. The operator induced by A in $C_x$ has a simple continuous spectrum, and, as in [147], the operator with a purely continuous spectrum can be split into operators with simple continuous spectra in mutually orthogonal subspaces, the orthogonal sum of which gives the whole of H. In all the formulae we have a sum of Hellinger integrals instead of one such integral. A connection can be established, precisely
as in [152], between C, and L$”. 

Let A be a self-conjugate operator and U a unitary operator. The operator $A' = UAU^{-1}$ is defined on the lineal $D(A')$, obtained by applying U to the lineal $D(A)$. Let us show that $A'$ is self-conjugate. Let

$$(UAU^{-1}x, y) = (x, y^*)$$

for all x of $D(A')$. We have to show that $y \in D(A')$ and that $y^* = UAU^{-1}y$. The above equation can be rewritten as $(Ax', U^{-1}y) = (Ux', y^*)$, where $x' = U^{-1}x$ is any element of $D(A)$, or as $(Ax', U^{-1}y) = (x', U^{-1}y^*)$, whence it follows, since A is self-conjugate, that $U^{-1}y \in D(A)$ and $U^{-1}y^* = AU^{-1}y$, i.e. $y \in D(A')$ and $y^* = UAU^{-1}y$, which is what we set out to prove.

Let $\mathcal{E}_\lambda$ be the spectral function of operator A. Now $\mathcal{E}'_\lambda = U\mathcal{E}_\lambda U^{-1}$ has all the properties of a resolution of the identity, and $\|\mathcal{E}'_\lambda x\| = \|\mathcal{E}_\lambda U^{-1}x\|$. If $x \in D(A')$, then $U^{-1}x \in D(A)$, so that the series

$$\sum_{k=-\infty}^{+\infty} \lambda_k^2\, \Delta_k \|\mathcal{E}'_\lambda x\|^2$$

is convergent, and, as in [192], the sum

$$\sum_{k=-\infty}^{+\infty} \lambda_k\, \Delta_k \mathcal{E}'_\lambda x$$

has a limit $A'x = UAU^{-1}x$, whence it is clear that $\mathcal{E}'_\lambda = U\mathcal{E}_\lambda U^{-1}$ is the spectral function of $A'$. The test of [153] for the unitary equivalence of operators remains in force.

The concepts of differential solution and of a complete system of differential solutions are retained without change. Every differential solution $x(\lambda)$ (continuous in the sense of space H) has the form $\mathcal{E}_\lambda x$, where $x \in D(A)$; it is assumed here that $x(\lambda) \to 0$ as $\lambda \to -\infty$.


197. Functions of a self-conjugate operator. If $\mathcal{E}_\lambda$ is the spectral function of a self-conjugate operator A and $f(\lambda)$ is a bounded function in the interval $-\infty < \lambda < +\infty$, measurable with respect to all non-decreasing functions $\|\mathcal{E}_\lambda x\|^2$, then

$$(f(A)x, y) = \int_{-\infty}^{+\infty} f(\lambda)\, d(\mathcal{E}_\lambda x, y) \qquad (x \in H;\ y \in H) \tag{85}$$

defines, precisely as in [155], a bounded operator $f(A)$, defined throughout H, and having all the properties indicated in [155]. Notice that the values of $f(\lambda)$ on a set of measure zero with respect to all the $\|\mathcal{E}_\lambda x\|^2$ have no effect on integral (85), i.e. they have no effect on $f(A)$. Let us now generalize the concept of a function of an operator $f(A)$ to real functions $f(\lambda)$ with finite values, that are measurable as before with respect to all $\|\mathcal{E}_\lambda x\|^2$, but are unbounded. Let $f_N(\lambda)$ denote the cut-off function, i.e. the function defined by the equations $f_N(\lambda) = f(\lambda)$ if $|f(\lambda)| \leq N$, $f_N(\lambda) = N$ if $f(\lambda) > N$, and $f_N(\lambda) = -N$ if $f(\lambda) < -N$. We can form the bounded operator $f_N(A)$ for the bounded function $f_N(\lambda)$. We now have, by (302) of [155], when $y = x$ and $f_1(\lambda) = f_2(\lambda)$:

$$\|f_N(A)x - f_M(A)x\|^2 = \int_{-\infty}^{+\infty} |f_N(\lambda) - f_M(\lambda)|^2\, d\|\mathcal{E}_\lambda x\|^2. \tag{86}$$

If $f(\lambda)$ belongs to $L_2$ with respect to $\|\mathcal{E}_\lambda x\|^2$, the right-hand side tends to zero as M and $N \to +\infty$, since $|f_N(\lambda)| \leq |f(\lambda)|$ and $f_N(\lambda) \to f(\lambda)$ almost everywhere with respect to $\|\mathcal{E}_\lambda x\|^2$, i.e. the sequence $f_N(A)x$ is mutually convergent, and a limiting element exists, which we write as $f(A)x$, i.e. $f_N(A)x \Rightarrow f(A)x$ as $N \to +\infty$, if

$$\int_{-\infty}^{+\infty} f^2(\lambda)\, d\|\mathcal{E}_\lambda x\|^2 < +\infty. \tag{87}$$

It is natural to write $D[f(A)]$ for the set of elements x that satisfy this condition. Some relevant results will be mentioned, whilst omitting the proofs.
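Condition (87) and the role of the cut-off functions $f_N$ can be illustrated on a hypothetical operator with purely point spectrum: eigenvalues $\lambda_k = k$ $(k = 1, 2, \ldots)$ with orthonormal eigenelements, so that for an element with coordinates $c_k$ the integral in (87) becomes $\sum_k f^2(\lambda_k)|c_k|^2$. The sketch takes $f(\lambda) = \lambda$ and compares $c_k = 1/k^2$ (the sum converges) with $c_k = 1/k$ (the sum diverges). The infinite sums are truncated at a large index, which is an approximation made for the example.

```python
import math

TERMS = 200000  # truncation index for the infinite sums (an approximation)

def cutoff_norm_sq(coeff, N):
    """||f_N(A) x||^2 = sum_k min(k, N)^2 * |c_k|^2 for f(lambda) = lambda, lambda_k = k."""
    return sum(min(k, N) ** 2 * coeff(k) ** 2 for k in range(1, TERMS + 1))

inside = lambda k: 1.0 / (k * k)   # sum k^2 |c_k|^2 = sum 1/k^2 converges
outside = lambda k: 1.0 / k        # sum k^2 |c_k|^2 = sum 1 diverges

levels = (10, 100, 1000)
norms_inside = [cutoff_norm_sq(inside, N) for N in levels]
norms_outside = [cutoff_norm_sq(outside, N) for N in levels]
limit_inside = math.pi ** 2 / 6.0  # value of sum 1/k^2
```

For the first element the quantities $\|f_N(A)x\|^2$ increase with N but stay below $\pi^2/6$, so the element satisfies (87); for the second they grow without bound as N increases, so the element does not belong to $D[f(A)]$.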

The lineal $D[f(A)]$ is everywhere dense in H, $f(A)$ is a self-conjugate operator, and

$$(f(A)x, y) = \int_{-\infty}^{+\infty} f(\lambda)\, d(\mathcal{E}_\lambda x, y) \qquad (x \in D[f(A)],\ y \in H). \tag{88}$$

In the case of a complex function $f(\lambda) = f_1(\lambda) + f_2(\lambda)i$, we take $f(A) = f_1(A) + f_2(A)i$, and the lineal $D[f(A)]$ is defined as before by condition (87), with $f^2(\lambda)$ replaced by $|f(\lambda)|^2$. The following proposition holds, just as in [156]: the necessary and sufficient condition for an operator to be a function of a self-conjugate operator A is that it be closed and commute with any bounded operator that commutes with A.

Let us turn to the question of the commutation of general self-conjugate operators. On recalling the theorem of [156], the following definition is naturally arrived at: two self-conjugate operators A and B are said to commute if their spectral functions $\mathcal{E}_\lambda$ and $\mathcal{F}_\mu$ (bounded operators) commute for any $\lambda$ and $\mu$. By the theorem mentioned, this definition is equivalent to the ordinary one if A and B are bounded. If A is non-bounded and B is bounded, we have the definition of commutation of [191]. It may easily be seen to coincide with the one just given, if A and B are self-conjugate. For, by the theorem of [191], commutation in the previous sense is equivalent to the fact that B commutes with $\mathcal{E}_\lambda$ for any $\lambda$, whilst this is equivalent to the fact [143] that, given any $\mu$, $\mathcal{F}_\mu$ commutes with all $\mathcal{E}_\lambda$, i.e. we arrive at our new definition of commutation.

By starting from the new definition of commutation of self-conjugate operators, it can be shown that real functions of the same self-conjugate operator A commute and that, if the self-conjugate operators $A_1, A_2, \ldots$ commute in pairs, they are all functions of the same operator A (cf. [156]).

Let us consider sums and products of unbounded operators. The operator $(A + B)x = Ax + Bx$ is defined for elements x that belong simultaneously to $D(A)$ and $D(B)$. The operator $(AB)x = A(Bx)$ is defined for x such that $x \in D(B)$ and $Bx \in D(A)$. If a is any complex number, the operator $(aA)x = a(Ax)$ is defined on $D(A)$. Let A and B be self-conjugate commuting operators, where B is bounded and defined on the whole of H. The operator $ABx$ is now defined on the lineal $l'$ of x such that $Bx \in D(A)$. In accordance with the definition of commutation, if $x \in D(A)$, then $Bx \in D(A)$, i.e. $D(A)$ belongs to $l'$, though $l'$ may be a wider lineal than $D(A)$. Let us show that AB is self-conjugate on $l'$. Let $(ABx, y) = (x, y^*)$ for $x \in l'$, and all the more for $x \in D(A)$. We have to show that $y \in l'$ and that $y^* = ABy$. On assuming that $x \in D(A)$, we can replace the equation in question by $(BAx, y) = (x, y^*)$ for $x \in D(A)$, or, since B is bounded and self-conjugate, $(Ax, By) = (x, y^*)$ for $x \in D(A)$; hence, since A is self-conjugate, we have $By \in D(A)$ and $y^* = ABy$, which is what we set out to prove. Notice that, if A and B are unbounded commuting self-conjugate operators, the operator AB may not be self-conjugate, although its conjugate $(AB)^*$ is always self-conjugate.

Let us apply the definitions of sum and product to the power of an operator A. The lineal $D(A^2)$ consists of the x such that $x \in D(A)$ and $Ax \in D(A)$, i.e. $D(A^2)$ belongs to $D(A)$ and may be narrower than $D(A)$. Similarly, $D(A^3)$ consists of the x such that $x \in D(A^2)$ and $A^2x \in D(A)$, so that $D(A^3)$ belongs to $D(A^2)$. A polynomial of the form $a_0E + a_1A + a_2A^2 + \ldots + a_nA^n$ is obviously defined on the lineal $D(A^n)$. It can be shown that this polynomial coincides with the function of operator A defined above, if we take $f(\lambda) = a_0 + a_1\lambda + \ldots + a_n\lambda^n$, and that the set of elements on which all the polynomials are defined is a lineal dense in H.

If the self-conjugate operator A is positive, i.e. the lower bound of the spectrum $m_A \geq 0$, we can form as in [143] the positive self-conjugate operator $A^{1/2}$, the square of which is equal to A:

$$A^{1/2} = \int_{0}^{+\infty} \sqrt{\lambda}\, d\mathcal{E}_\lambda.$$

It may easily be seen that only one positive self-conjugate operator B exists, the square of which is A. For, let $\mathcal{E}'_\mu$ be a resolution of the identity of B. We must have

$$B = \int_{0}^{+\infty} \mu\, d\mathcal{E}'_\mu \quad \text{and} \quad A = B^2 = \int_{0}^{+\infty} \mu^2\, d\mathcal{E}'_\mu,$$

or

$$A = B^2 = \int_{0}^{+\infty} \lambda\, d\mathcal{E}'_{\sqrt{\lambda}}.$$

The family of operators $\mathcal{E}'_{\sqrt{\lambda}}$, depending on the parameter $\lambda$, is a resolution of the identity, and in view of the uniqueness of the spectral function, $\mathcal{E}'_{\sqrt{\lambda}} = \mathcal{E}_\lambda$; it follows that B must coincide with $A^{1/2}$.


198. Small perturbations of the spectrum. Let us consider the variation in the spectrum of a self-conjugate operator when another self-conjugate operator is added to it. Notice first of all that Theorem 1 of [157] holds for unbounded self-conjugate operators. Two further theorems must be proved.

Let L be a subspace. Its dimensionality r is the number of elements of an orthonormal system complete in L. This number r may be finite or infinite. It is easily shown to be independent of the choice of complete orthonormal system in L.

Lemma 1. Let $L_1$ and $L_2$ be two subspaces of $r_1$ and $r_2$ dimensions. If $r_1 < r_2$, there is a non-zero element in $L_2$ which is orthogonal to all the elements of $L_1$.

Notice that, since H is separable, the number $r_1$ is finite, whilst $r_2$ may be either finite or infinite. Let us use reductio ad absurdum. Suppose that there is no non-zero element in $L_2$ which is orthogonal to $L_1$. Let $x_1, x_2, \ldots, x_{r_1}$ be a complete orthonormal system in $L_1$ (base in $L_1$) and P a projector onto subspace $L_2$. We have for any element $v \in L_2$:

$$(v, Px_k) = (Pv, x_k) = (v, x_k).$$

The elements $Px_k$ of $L_2$ define some subspace $L_3$ of dimensions $r_3 \leq r_1$ (the sign is $<$ if the $Px_k$ are linearly dependent). Let us show that $L_3$ must be the same as $L_2$. If this were not the case, there would exist an element $y \in L_2$, different from zero and orthogonal to $L_3$, and we should have $(y, Px_k) = 0$, i.e. by the above equation, $(y, x_k) = 0$ $(k = 1, 2, \ldots, r_1)$, i.e. y is orthogonal to $L_1$. But there is no such element y, by hypothesis. We have shown that $L_3$ coincides with $L_2$, i.e. $r_2 = r_3$. But $r_3 \leq r_1$, as we have seen, so that $r_2 \leq r_1$, which contradicts the condition in the lemma.

Lemma 2. Let A be a self-conjugate operator (unbounded or bounded), B a bounded self-conjugate operator, $\mathcal{E}_\lambda$ the spectral function of A and $\mathcal{E}'_\lambda$ the spectral function of $A' = A + B$. Further, let $\Delta$ be some finite interval $[a, b]$, and $\Delta_{B,\varepsilon}$ the interval $[a - \|B\| - \varepsilon,\ b + \|B\| + \varepsilon]$, where $\varepsilon$ is any given positive number. The number of dimensions of subspace $L_1 = (\Delta_{B,\varepsilon}\mathcal{E}'_\lambda)x$ $(x \in H)$ is now not less than the number of dimensions of subspace $L_2 = (\Delta\mathcal{E}_\lambda)x$ $(x \in H)$.

Notice first of all that we are using the notation of [141] (e.g. $\Delta\mathcal{E}_\lambda = \mathcal{E}_b - \mathcal{E}_a$). We use reductio ad absurdum. Suppose that $L_1$ has fewer dimensions than $L_2$. By Lemma 1, there exists an element $y \in L_2$, different from zero and orthogonal to $L_1$. We can assume that $\|y\| = 1$. On writing $c = (a + b)/2$ for simplicity, and noting that $y \in L_2$, we obtain

$$\|Ay - cy\|^2 = \int_{a}^{b} (\lambda - c)^2\, d(\mathcal{E}_\lambda y, y) \leq \left(\frac{b - a}{2}\right)^2$$

and

$$\|A'y - cy\| \leq \|Ay - cy\| + \|By\| \leq \frac{b - a}{2} + \|B\|. \tag{89}$$

On the other hand, since $y \perp L_1$, we have

$$\|A'y - cy\|^2 = \int_{-\infty}^{a - \|B\| - \varepsilon} (\lambda - c)^2\, d(\mathcal{E}'_\lambda y, y) + \int_{b + \|B\| + \varepsilon}^{+\infty} (\lambda - c)^2\, d(\mathcal{E}'_\lambda y, y),$$

whence we obtain, on again using the orthogonality of y to $L_1$:

$$\|A'y - cy\|^2 \geq \left(\frac{b - a}{2} + \|B\| + \varepsilon\right)^2 \int_{-\infty}^{+\infty} d(\mathcal{E}'_\lambda y, y),$$

i.e.

$$\|A'y - cy\| \geq \frac{b - a}{2} + \|B\| + \varepsilon. \tag{90}$$

The contradiction of inequalities (89) and (90) proves the lemma.

THEOREM 1. Let the spectrum of the operator A inside the interval $\Delta$ consist of a finite number of eigenvalues, the sum of the multiplicities of which is equal to k, where k is a finite number, and let the distance of the remaining part of the spectrum of A from $\Delta$ be greater than $2\|B\|$.

The spectrum of $A'$ in the interval $\Delta_{B,0} = [a - \|B\|,\ b + \|B\|]$ now consists of eigenvalues, the sum of the multiplicities of which is equal to k.

By Lemma 2, given any $\varepsilon > 0$, the number of dimensions of $L_1$ is $\geq k$. If the $>$ sign holds, we can say, on again using Lemma 2 and the equation $A = A' - B$, that the subspace $\Delta_{2B,2\varepsilon}\mathcal{E}_\lambda x$ $(x \in H)$ has more than k dimensions. But by hypothesis, this cannot be true for all $\varepsilon$ sufficiently close to zero, i.e. $L_1$ has k dimensions for such $\varepsilon$, whence the theorem follows.

Note. Application of the theorem with $k = 1$ gives us the possibility of watching the variation of an isolated simple eigenvalue, in the case of small perturbations.
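Theorem 1 with $k = 1$ can be checked on a small numerical example. The matrices A and B and the interval $[a, b]$ below are hypothetical choices: A has the isolated simple eigenvalue 0 inside $[-0.5, 0.5]$, the rest of its spectrum (the eigenvalue 10) lies at distance 9.5 from the interval, and $\|B\| = 0.5$, so the hypothesis that this distance exceeds $2\|B\|$ holds.

```python
import math

def eig_sym2(M):
    """Eigenvalues of a real symmetric 2x2 matrix, in increasing order."""
    a, b, d = M[0][0], M[0][1], M[1][1]
    mean, radius = (a + d) / 2.0, math.hypot((a - d) / 2.0, b)
    return [mean - radius, mean + radius]

A = [[0.0, 0.0], [0.0, 10.0]]   # hypothetical unperturbed operator
B = [[0.3, 0.4], [0.4, -0.3]]   # hypothetical bounded perturbation
A_prime = [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

# For a symmetric matrix the operator norm is the largest |eigenvalue|.
norm_B = max(abs(t) for t in eig_sym2(B))

a, b = -0.5, 0.5                # the interval Delta = [a, b]
k = sum(1 for t in eig_sym2(A) if a <= t <= b)
k_prime = sum(1 for t in eig_sym2(A_prime) if a - norm_B <= t <= b + norm_B)
shift = abs(eig_sym2(A_prime)[0] - eig_sym2(A)[0])
```

Both counts equal one: the perturbed operator has exactly one eigenvalue in the widened interval $[a - \|B\|, b + \|B\|]$, and it has moved from the original eigenvalue by less than $\|B\|$.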

THEOREM 2. If there is at least one point of condensation of the spectrum of an operator A inside an interval $\Delta$, there must be at least one point of condensation of the spectrum of $A'$ inside the interval $\Delta_{B,0} = [a - \|B\|,\ b + \|B\|]$.

In this case $k = \infty$, and the theorem follows from Lemma 2.

Note. If $\lambda_0$ is a point of condensation of the spectrum of A, there is at least one point of condensation of the spectrum of $A'$ in the interval $[\lambda_0 - \|B\|,\ \lambda_0 + \|B\|]$.

199. The operator of multiplication. Let us take space $L_2$ on the interval $(-\infty, +\infty)$ and the operator of multiplication by the independent variable:

$$Af(x) = xf(x). \tag{91}$$

The lineal $D(A)$ consists of functions $f(x)$ of $L_2$ such that $xf(x) \in L_2$ also, and in particular, every function of $L_2$ differing from zero only on a finite interval belongs to $D(A)$, whence it follows that the lineal $D(A)$ is everywhere dense in H. Let us show that A is self-conjugate. We have to show that, if $(Ax, y) = (x, y^*)$ for $x \in D(A)$, then $y \in D(A)$ and $y^* = Ay$. The condition implies in the present case that

$$\int_{-\infty}^{+\infty} xf(x)\, \overline{\varphi(x)}\, dx = \int_{-\infty}^{+\infty} f(x)\, \overline{\varphi^*(x)}\, dx \tag{92}$$

for all $f(x)$ of $D(A)$ and certain $\varphi(x)$ and $\varphi^*(x)$ of $L_2$, and we have to show that $\varphi(x) \in D(A)$ and $\varphi^*(x) = x\varphi(x)$. We apply (92) to $f(x)$ of $L_2$, where $f(x)$ differs from zero only on some finite interval $(-a, +a)$. Notice that such a function belongs to $D(A)$:

$$\int_{-a}^{+a} f(x)\, \left[\overline{\varphi^*(x)} - x\overline{\varphi(x)}\right] dx = 0.$$

It follows immediately [52] that, since $f(x)$ is arbitrary, $\varphi^*(x) - x\varphi(x)$ is equivalent to zero in the interval $(-a, +a)$, and hence in $(-\infty, +\infty)$, since a is arbitrary, i.e. we can take $\varphi^*(x) - x\varphi(x) = 0$. But $\varphi^*(x) \in L_2$, so that $x\varphi(x) \in L_2$ also and $\varphi^*(x) = x\varphi(x)$, which is what we set out to prove. The spectral function of operator (91) is defined as in [152], and is given by

$$\mathcal{E}_\lambda f(x) = \begin{cases} f(x) & \text{for } x \leq \lambda, \\ 0 & \text{for } x > \lambda, \end{cases} \tag{93}$$

and the operator has a purely continuous spectrum, distributed over the interval $-\infty < \lambda < +\infty$.

Operator (91) is obviously unbounded. Notice also that every function $f(x)$ of $D(A)$ is summable over $(-\infty, +\infty)$. For, on putting $xf(x) = \omega(x)$, where $\omega(x) \in L_2$, we can write $f(x) = (1/x)\,\omega(x)$; since $1/x$ and $\omega(x)$ belong to $L_2$ in any interval $(-\infty, -a)$ and $(a, \infty)$, where $a > 0$, and $f(x)$, being a function of $L_2$, is summable on the finite interval $[-a, a]$, $f(x)$ must be summable in $(-\infty, +\infty)$.
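That operator (91) is unbounded can be made explicit with a simple family of functions. In the sketch below, $f_n$ is taken to be the characteristic function of the interval $[n, n+1]$ (a hypothetical choice for illustration), for which $\|f_n\|^2 = 1$ and $\|Af_n\|^2 = \int_n^{n+1} x^2\, dx = ((n+1)^3 - n^3)/3$.

```python
import math

def norm_sq_f(n):
    """||f_n||^2 for the characteristic function of [n, n + 1]."""
    return 1.0

def norm_sq_Af(n):
    """||A f_n||^2 = integral of x^2 over [n, n + 1], evaluated in closed form."""
    return ((n + 1) ** 3 - n ** 3) / 3.0

# Ratio ||A f_n|| / ||f_n|| for a few values of n.
ratios = [(n, math.sqrt(norm_sq_Af(n) / norm_sq_f(n))) for n in (1, 10, 100, 1000)]
```

Each ratio is at least n, so no constant C can satisfy $\|Af\| \leq C\|f\|$ for all f of $D(A)$.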

Precisely as in [152], we can consider an operator of multiplication by a function, which we shall assume to be real and bounded:

$$A'f(x) = \omega(x) f(x), \tag{94}$$

an unbounded operator being obtained in the case of unbounded $\omega(x)$. Suppose for definiteness that $\omega(x)$ is bounded throughout $(-\infty, +\infty)$, except for the neighbourhoods, however small, of a finite number of points. On taking any closed interval that does not contain these points, and arguing as above, (94) is seen to be a self-conjugate operator for $f(x) \in L_2$ such that $\omega(x) f(x) \in L_2$ also. Its spectral function is given, as in [152], by

$$\mathcal{E}'_\lambda f(x) = \begin{cases} f(x) & \text{for } \omega(x) \leq \lambda, \\ 0 & \text{for } \omega(x) > \lambda. \end{cases} \tag{95}$$

If say $\omega(x) \in L_2$, the lineal $D(A')$ contains all bounded functions of $L_2$ and $A'$ is also a self-conjugate operator. Notice that the operator $A'$ can be regarded as a function $\omega(A)$ of the operator A of multiplication by the independent variable for the class of functions $\omega(x)$ indicated in [197].

Let B denote the self-conjugate operator

$$B\Phi(x) = i\,\frac{d\Phi(x)}{dx} = i\varphi(x), \tag{96}$$

defined in $H = L_2(-\infty, +\infty)$ on the set $D(B)$ of functions $\Phi(x)$, absolutely continuous in any finite interval and having a derivative in $L_2(-\infty, +\infty)$ [188]. On using Fourier's transformation, the lineals $D(A)$ and $D(B)$ may be transformed into one another.

On writing

$$\Psi_N(x) = \frac{1}{\sqrt{2\pi}} \int_{-N}^{+N} \Phi(t)\, e^{ixt}\, dt \quad \text{and} \quad \psi_N(x) = \frac{1}{\sqrt{2\pi}} \int_{-N}^{+N} i\varphi(t)\, e^{ixt}\, dt, \tag{97}$$

where $\Phi(t)$ is any function of $D(B)$, and integrating by parts, we get

$$x\Psi_N(x) - \psi_N(x) = -\frac{i}{\sqrt{2\pi}} \left[\Phi(N)\, e^{ixN} - \Phi(-N)\, e^{-ixN}\right], \tag{98}$$

where the right-hand side tends uniformly with respect to x in the infinite interval to zero as $N \to \infty$ [188]. The functions $\psi_N(x)$ and $\Psi_N(x)$ tend in the mean in the infinite interval to functions $\psi(x)$ and $\Psi(x)$ of $L_2$ [178]. This will be all the more true in any finite interval $[-a, +a]$, and furthermore, $x\Psi_N(x)$ will tend in the mean to $x\Psi(x)$ in such an interval. It follows from (98) that, given any positive $\varepsilon$ and sufficiently large N, we have

$$\int_{-a}^{+a} |x\Psi_N(x) - \psi_N(x)|^2\, dx \leq \varepsilon,$$

or, since we can pass to the limit under the norm sign,

$$\int_{-a}^{+a} |x\Psi(x) - \psi(x)|^2\, dx \leq \varepsilon,$$

whence, since $\varepsilon$ and a are arbitrary,

$$\int_{-\infty}^{+\infty} |x\Psi(x) - \psi(x)|^2\, dx = 0,$$

i.e. $x\Psi(x) = \psi(x)$, or

$$\frac{x}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} \Phi(t)\, e^{ixt}\, dt = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} i\varphi(t)\, e^{ixt}\, dt, \tag{99}$$

which can be written as

$$T^*(\Phi) = \frac{1}{x}\, T^*(i\varphi) = f, \tag{100}$$

and, since $T^*(i\varphi) \in L_2$, we see that $xf(x) \in L_2$ as well as $f(x) \in L_2$, i.e. $f(x) \in D(A)$.
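The relation $x\,T^*(\Phi) = T^*(i\varphi)$ with $\varphi = \Phi'$ can be checked numerically for a particular function. The sketch below takes the Gaussian $\Phi(t) = e^{-t^2/2}$ as a hypothetical example and approximates $T^*$, with kernel $e^{ixt}/\sqrt{2\pi}$, by the trapezoidal rule on a truncated interval; the truncation width, the step count and the function name `t_star` are assumptions of the example.

```python
import cmath
import math

def t_star(g, x, half_width=12.0, steps=4800):
    """Trapezoidal approximation of T*(g)(x) = (2*pi)^(-1/2) * integral of g(t) e^{ixt} dt,
    with the infinite interval truncated to [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        t = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * g(t) * cmath.exp(1j * x * t)
    return total * h / math.sqrt(2.0 * math.pi)

Phi = lambda t: math.exp(-t * t / 2.0)               # hypothetical element of D(B)
i_phi = lambda t: 1j * (-t) * math.exp(-t * t / 2.0) # i * Phi'(t)

points = [-2.0, -0.5, 0.0, 1.0, 3.0]
residuals = [abs(x * t_star(Phi, x) - t_star(i_phi, x)) for x in points]
value_at_one = t_star(Phi, 1.0)  # for the Gaussian, T*(Phi)(x) = exp(-x^2 / 2)
```

The residuals vanish up to rounding, so for this $\Phi$ the transform $f = T^*(\Phi)$ satisfies $xf(x) = T^*(i\varphi)(x)$, in agreement with (99) and (100).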

Let us now show that, if $f(x)$ is any function of $D(A)$, $T(f) \in D(B)$. Since $f(x)$ and $xf(x)$ belong to $L_2$, we can form $T(f)$ and $T(xf)$. On taking (100) into account, we can introduce the notation

$$\Phi(x) = T(f) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} f(t)\, e^{-ixt}\, dt, \qquad i\varphi(x) = T(xf) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} tf(t)\, e^{-ixt}\, dt. \tag{101}$$

We have [178]:

$$i \int_{0}^{N} \varphi(t)\, dt = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} tf(t)\, \frac{1}{it}\, (1 - e^{-iNt})\, dt.$$

If we make use of the first of formulae (101), from which it follows, since $f(t)$ is summable in $(-\infty, +\infty)$, that

$$\Phi(0) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} f(t)\, dt,$$

we can rewrite the above formula as

$$\int_{0}^{N} \varphi(t)\, dt = \Phi(N) - \Phi(0),$$

whence it follows that $\Phi(x) \in D(B)$ and $i\varphi(x) = B\Phi(x)$. We saw above that $T^*$ transforms any element $\Phi$ of $D(B)$ into an element f of $D(A)$, and we have just shown that T transforms any f of $D(A)$ into $\Phi$ of $D(B)$. A one-to-one correspondence is thus established between $D(A)$ and $D(B)$, where T transforms $D(A)$ into $D(B)$ and a passage from $f(x)$ to $xf(x)$ before the transformation corresponds to a passage from $\Phi(x)$ to $B\Phi(x) = i\varphi(x)$ after transformation, i.e.

$$B = TAT^*, \tag{102}$$

which leads to the following theorem.

THEOREM. The self-conjugate operators A and B are unitarily equivalent and (102) holds, where T is the Fourier transformation.

We have $\mathcal{E}'_\lambda = T\mathcal{E}_\lambda T^*$ for the spectral function $\mathcal{E}'_\lambda$ of the operator B, where $\mathcal{E}_\lambda$ is given by (93), i.e.

$$\mathcal{E}'_\lambda f(x) = \frac{1}{2\pi} \int_{-\infty}^{\lambda} \left[\int_{-\infty}^{+\infty} f(t)\, e^{iyt}\, dt\right] e^{-ixy}\, dy, \tag{103}$$

or

$$\mathcal{E}'_\lambda f(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\lambda} T^*(f)\, e^{-ixy}\, dy. \tag{104}$$

We can write $\Phi(x)$ as

$$\Phi(x) = \int_{-\infty}^{x} \varphi(t)\, dt = -\int_{x}^{+\infty} \varphi(t)\, dt, \tag{105}$$

where it cannot be asserted that $\varphi(t)$ is absolutely integrable in the infinite interval, and the integrals written have to be understood as the limits of integrals over a finite interval as this is extended. The self-conjugate operator $B^{-1}$, which is obviously defined by $B^{-1}\varphi = -i\Phi$, is not given in the whole of $L_2$; it is given only for $\varphi(x)$ such that $\Phi(x)$, defined by (105), also belongs to $L_2$. This is due to the fact that $\lambda = 0$ belongs to the continuous spectrum of B.

If we consider operator (94), the operator $TA'T^*$ will be a function $\omega(B)$ of the operator B, so that

$$\omega(B) f(x) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \omega(y) \left[\int_{-\infty}^{+\infty} f(t)\, e^{iyt}\, dt\right] e^{-ixy}\, dy. \tag{106}$$

If $\omega(x)$ is an unbounded function, the lineal on which this operator is defined is obtained from $D(A')$ with the aid of the operator T. Remember that the integrals written, with infinite limits, have to be understood in the sense of a convergence in the mean. Like A, B has a simple continuous spectrum. The function $\omega(x)$ appearing in (106) must evidently satisfy the conditions that we formulated for $f(\lambda)$ in [197].


200. Integral operators. We take the integral operator in the interval $[a, b]$ with kernel $K(x, y)$, satisfying

$$K(y, x) = \overline{K(x, y)} \tag{107}$$

and such that

$$K^2(x) = \int_{a}^{b} |K(x, y)|^2\, dy < +\infty \qquad (K(x) \geq 0) \tag{108}$$

for almost all x of $[a, b]$. The corresponding operator

$$\varphi(x) = Kf(x) = \int_{a}^{b} K(x, y) f(y)\, dy \tag{109}$$

is defined on the lineal $D(K)$ of $f(x)$ of $L_2$ in $[a, b]$ such that $\varphi(x)$, defined by (109), also belongs to $L_2$. Let us also consider, as in [173], the lineal $l$ of functions $f(x)$ of $L_2$ such that

$$\int_{a}^{b} K(x)\, |f(x)|\, dx < +\infty. \tag{110}$$

We have seen that the lineal $l$ is everywhere dense in $L_2$ [173]. Let us show that, if $f(x) \in l$, it belongs all the more to $D(K)$. It will follow from this that $D(K)$ is everywhere dense in $L_2$. Let $f(x) \in l$. We can write

$$|\varphi(x)|^2 = \int_{a}^{b} K(x, y) f(y)\, dy \cdot \overline{\int_{a}^{b} K(x, t) f(t)\, dt},$$

so that

$$\int_{a}^{b} |\varphi(x)|^2\, dx \leq \int_{a}^{b}\!\int_{a}^{b}\!\int_{a}^{b} |K(x, y)|\, |K(x, t)|\, |f(y)|\, |f(t)|\, dy\, dt\, dx, \tag{111}$$

and it is sufficient to verify that the integral on the right has a finite value no matter what the order of integration. By Buniakowski's inequality:

$$\int_{a}^{b} |K(x, y)|\, |K(x, t)|\, dx \leq \left[\int_{a}^{b} |K(x, y)|^2\, dx \cdot \int_{a}^{b} |K(x, t)|^2\, dx\right]^{1/2} = K(y)\, K(t),$$

and the right-hand side of (111) does not exceed the product

$$\int_{a}^{b} K(y)\, |f(y)|\, dy \cdot \int_{a}^{b} K(t)\, |f(t)|\, dt,$$

which has a finite value, since $f(x) \in l$. Let $Af(x)$ denote the operator defined by (109) on the lineal $l$. It may easily be shown that this is a symmetric operator, i.e.

$$\int_{a}^{b} \left[\int_{a}^{b} K(x, y) f(y)\, dy\right] \overline{\omega(x)}\, dx = \int_{a}^{b} \left[\int_{a}^{b} K(x, y)\, \overline{\omega(x)}\, dx\right] f(y)\, dy, \tag{112}$$

if $f(x)$ and $\omega(x)$ belong to $l$, where, by (107),

$$\int_{a}^{b} K(x, y)\, \overline{\omega(x)}\, dx = \overline{\int_{a}^{b} K(y, x)\, \omega(x)\, dx}$$

obviously belongs to $L_2$ as a function of y. To prove (112), it is sufficient to verify that

$$\int_{a}^{b}\!\int_{a}^{b} |K(x, y)|\, |f(y)|\, |\omega(x)|\, dy\, dx \tag{113}$$

is finite whatever the order of integration. If we integrate first with respect to x and use Buniakowski's inequality, expression (113) is seen to be not greater than

$$\|\omega\| \int_{a}^{b} K(y)\, |f(y)|\, dy,$$

and this quantity is finite, since $f(y) \in l$.
The symmetric operator $A$, defined by (109) on the lineal $l$, is by
no means always self-conjugate; but it has a conjugate $A^*$. Let us
show that $A^*$ is the same as the operator $K$, which is defined by the
same formula (109) on the lineal $D(K)$ of $f(x) \in L_2$ such that
$\varphi(x) \in L_2$ also. Suppose that, for all $f(x) \in l$,

$$\int_a^b\left[\int_a^b K(x,y)\,f(y)\,dy\right]\overline{\omega(x)}\,dx = \int_a^b f(x)\,\overline{\omega^*(x)}\,dx, \tag{114}$$

where $\omega(x)$ and $\omega^*(x) \in L_2$. We have to show that

$$\omega^*(x) = \int_a^b K(x,y)\,\omega(y)\,dy, \tag{115}$$

whence it will follow that $\omega(x) \in D(K)$ also. When proving that
integral (113) is finite, we only made use of the fact that the functions
$f(x)$ belong to $l$. We can therefore change the order of integration in
the integral on the left-hand side of (114), and this formula can be
rewritten as

$$\int_a^b f(x)\,\overline{\left[\omega^*(x) - \int_a^b K(x,y)\,\omega(y)\,dy\right]}\,dx = 0. \tag{116}$$

The proof of the theorem of [173] leads us at once to the fact that
the difference in square brackets must vanish, and we get (115).
Conversely, if $\omega(x) \in D(K)$ and $\omega^*(x)$ is given by (115), it follows
at once from the above working that (114) holds. The above discussion
yields the theorem:

THEOREM. Let $K(y, x) = \overline{K(x, y)}$ and condition (108) be fulfilled.
Let $l$ be the lineal of functions $f(x) \in L_2$ in the interval $[a, b]$ satisfying
condition (110), and $D(K)$ the lineal of functions $f(x) \in L_2$ such that
$\varphi(x)$, defined by (109), belongs to $L_2$. The lineal $l$ is then everywhere dense
in $L_2$ and belongs to the lineal $D(K)$. If, in addition, $A$ is the operator
defined by (109) on the lineal $l$, and $K$ is the operator defined by the same
formula on $D(K)$, then $A$ is symmetric and $A^* = K$.

A necessary condition for $K$ to be self-conjugate is that it be
symmetric, which reduces, by (107), to the equation

$$\int_a^b\left[\int_a^b K(y,x)\,f(x)\,dx\right]\overline{\omega(y)}\,dy = \int_a^b\left[\int_a^b K(y,x)\,\overline{\omega(y)}\,dy\right] f(x)\,dx, \tag{117}$$

which must be satisfied for all $f(x)$ and $\omega(x)$ of $D(K)$. Let us show that
this condition is also sufficient for $K$ to be self-conjugate. In fact,
let

$$\int_a^b\left[\int_a^b K(x,y)\,f(y)\,dy\right]\overline{\omega(x)}\,dx = \int_a^b f(x)\,\overline{\omega^*(x)}\,dx$$

for all $f(x)$ of $D(K)$, where $\omega(x)$ and $\omega^*(x)$ belong to $L_2$. We have to
show that $\omega(x) \in D(K)$ and (115) holds, where, in view of the definition
of $D(K)$, it is sufficient to prove (115). On subtracting (117) term by
term from the last equation and noting (107), we get (116), whence
(115) follows, in view of the arbitrary choice of $f(x)$ of $D(K)$, which
contains $l$. Thus the necessary and sufficient condition for an
operator $K$ to be self-conjugate is that (117) be satisfied for any $f(x)$ and
$\omega(x)$ of $D(K)$.


Let us mention some simple examples of self-conjugate operators in the case
of a kernel dependent on a difference in the infinite interval $(-\infty, +\infty)$.
Let $g(t)$ be a real even function of $L_2$ in this interval and $f(x)$ any function of
$L_2$. Let us write $G(t) = T^*(g)$ and $F(t) = T^*(f)$. The function $G(t)$ is real, since
$g(t)$ is a real even function. We can write [178]:

$$\psi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty} g(x - y)\,f(y)\,dy = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty} G(t)\,F(t)\,e^{ixt}\,dt, \tag{118}$$

where, since $G(t) \in L_2$ and $F(t) \in L_2$, the product $G(t)F(t)$ is summable in
$(-\infty, +\infty)$. Let $l'$ be the lineal of functions $F(t)$ of $L_2$ such that $G(t)F(t) \in L_2$.
The right-hand side of (118) is $T(GF)$ on the lineal $l'$, so that $\psi(x)$ must belong
to $L_2$. On turning our attention to the middle part of (118), we can thus say
that, on the lineal $l_1$, which is got from $l'$ with the aid of the transformation $T$,
the integral operator $K$ with kernel $g(x - y)$ is unitarily equivalent to the
operator of multiplication by the function $G(t)$ of $L_2$, and is therefore a
self-conjugate operator. Remember that $D(K)$ denotes the lineal of $f(x)$ of $L_2$ such
that $\psi(x)$, defined by (118), also belongs to $L_2$. It follows from the above
arguments only that $l_1$ belongs to $D(K)$. It can be shown that $l_1$ coincides with
$D(K)$. This assertion is obviously equivalent to the following: if $\psi(x) \in L_2$ in
(118), then $G(t)F(t) \in L_2$. The coincidence with $D(K)$ follows immediately
from the fact that it is impossible to extend a self-conjugate operator so as
to again obtain a self-conjugate operator. We proved this in [187]. Hence,
if $g(t)$ is a real even function of $L_2$, the integral operator with kernel $g(x - y)$
is a self-conjugate operator in $D(K)$.


201. The extension of a closed symmetric operator. We shall 
assume in future that A is a closed symmetric operator. The next 
two theorems are fundamental for what follows. 

THEOREM 1. The lineal $D(A)$ of elements $x$ is mapped one-to-one
in accordance with the formulae

$$y = (A + iE)x, \tag{119}$$
$$z = (A - iE)x, \tag{120}$$

onto subspaces $L_i(A)$ and $L_{-i}(A)$, where, if $y$ and $z$ are elements of these
subspaces, corresponding to the same $x$ of $D(A)$, the distributive operator
$U$, transforming $y$ into $z$:

$$z = Uy, \tag{121}$$

maps $L_i(A)$ one-to-one onto $L_{-i}(A)$ whilst preserving the norm and the
scalar product:

$$\|Uy\| = \|y\|; \qquad (Uy_1, Uy_2) = (y_1, y_2). \tag{122}$$
We have

$$\|(A + iE)x\|^2 = ((A + iE)x, (A + iE)x) = (Ax, Ax) + i(x, Ax) - i(Ax, x) + (x, x),$$

whence it follows, since $A$ is symmetric, that

$$\|(A + iE)x\|^2 = \|Ax\|^2 + \|x\|^2 \qquad (x \in D(A)). \tag{123}$$

Let us show that, given distinct $x$, (119) yields distinct $y$. If distinct
$x_1$ and $x_2 \in D(A)$ were to give the same $y$, their difference $x = x_1 - x_2$
would give $y = 0$, and we have to show that $(A + iE)x = 0$ implies
$x = 0$. But this is an immediate consequence of (123). Thus (119)
maps $D(A)$ one-to-one onto a lineal $L_i(A)$. Let us show that it is
closed, i.e. that it is a subspace. On using (123), we get
$\|(A + iE)x\| \ge \|x\|$, whence it follows that $(A + iE)^{-1}$ is bounded.
But we proved in [184] that, if an operator $B$ is closed, and the bounded
operator $B^{-1}$ exists on $R(B)$, $R(B)$ must be a subspace. In other words,
$L_i(A)$ must be a subspace. Similarly, it follows from the equation

$$\|(A - iE)x\|^2 = \|Ax\|^2 + \|x\|^2, \tag{123_1}$$

analogous to (123), that (120) maps $D(A)$ one-to-one onto a
subspace $L_{-i}(A)$. On taking $y$ and $z$, corresponding to the same $x$,
we get the fully defined distributive transformation (121),
mapping $L_i(A)$ one-to-one onto $L_{-i}(A)$, where (123) and (123$_1$) may be
written as $\|y\|^2 = \|Ax\|^2 + \|x\|^2$ and $\|z\|^2 = \|Ax\|^2 + \|x\|^2$,
i.e. $\|z\| = \|y\|$ or $\|Uy\| = \|y\|$. The proof of the second of
formulae (122) is thus precisely the same as for unitary operators [137],
and the theorem is therefore proved.
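
As a purely finite-dimensional illustration, identity (123) and the equality $\|z\| = \|y\|$ can be checked numerically. The sketch below assumes $H = C^2$ and uses a hypothetical 2x2 Hermitian matrix in place of $A$; the helper names `matvec`, `inner` and `norm_sq` are introduced only for this example and do not come from the text.

```python
# Numerical check of (123) and of ||z|| = ||y|| in the assumed space H = C^2.

def matvec(m, v):
    """Apply the matrix m (list of rows) to the vector v."""
    return [sum(m[r][c] * v[c] for c in range(len(v))) for r in range(len(m))]

def inner(u, v):
    """Scalar product (u, v), linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def norm_sq(v):
    """Squared norm ||v||^2."""
    return inner(v, v).real

# Hypothetical Hermitian matrix: A[1][0] is the conjugate of A[0][1].
A = [[2.0, 1 - 1j], [1 + 1j, -1.0]]
x = [0.5 + 2j, -1.0 + 1j]

Ax = matvec(A, x)
y = [a + 1j * b for a, b in zip(Ax, x)]  # y = (A + iE)x, formula (119)
z = [a - 1j * b for a, b in zip(Ax, x)]  # z = (A - iE)x, formula (120)

lhs = norm_sq(y)                  # ||(A + iE)x||^2
rhs = norm_sq(Ax) + norm_sq(x)    # ||Ax||^2 + ||x||^2
```

In finite dimensions every symmetric operator is self-conjugate, so this sketch illustrates only the algebra of the proof, not the closedness argument.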

Notice that, if $A$ is self-conjugate, in view of the fact that $\pm i$
are regular points of $A$, $L_i(A)$ and $L_{-i}(A)$ coincide with $H$ [189]
and $U$ is a unitary transformation. If both, or at any rate one, of
the subspaces does not coincide with $H$, $U$ is generally called an
isometric operator, i.e. a distributive operator $U$, defined on a subspace
$L'$ and mapping it one-to-one onto another subspace $L''$ whilst
preserving the norm (and hence the scalar product), is known as an
isometric operator. The inverse $U^{-1}$, mapping $L''$ onto $L'$, is
obviously also an isometric operator. If $L'$ and $L''$ coincide with $H$, $U$ is
a unitary operator defined throughout $H$. The formulae $y = Ax + ix$
and $Uy = Ax - ix$ map $D(A)$ one-to-one onto $L_i(A)$ and $L_{-i}(A)$,
and lead to the formulae

$$x = \frac{1}{2i}(y - Uy); \qquad Ax = \frac{1}{2}(y + Uy),$$

the first of which maps $L_i(A)$ one-to-one onto $D(A)$. If we replace
$y$ by $2iy$, which leads to the same lineal $L_i(A)$ of elements $y$, we
obtain the simpler formulae

$$x = y - Uy; \tag{124}$$

$$Ax = i(y + Uy), \tag{125}$$

the first of which maps $L_i(A)$ onto $D(A)$ one-to-one as before,
whilst the second gives the corresponding element $Ax$. Notice that the
lineal $D(A)$, defined by (124), is dense in $H$. The isometric operator
$U$ is known as a Helly transformation of the closed symmetric
operator $A$ (it is more commonly called the Cayley transform of $A$).
Let us now prove the converse, in the accepted sense, of the
previous theorem.
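
The relation between $A$ and $U$ in (124) and (125) is easiest to see in the one-dimensional case. The sketch below assumes that $A$ is multiplication by a real number $a$, so that $U$ is multiplication by $u = (a - i)/(a + i)$; the function names `cayley` and `inverse_cayley` are hypothetical labels chosen for this example.

```python
# One-dimensional model: A is multiplication by a real number a.

def cayley(a):
    """u = (a - i)/(a + i), i.e. z = Uy with y = (a + i)x and z = (a - i)x."""
    return (a - 1j) / (a + 1j)

def inverse_cayley(u):
    """Recover a from u using x = y - uy and ax = i(y + uy), formulas (124), (125)."""
    return 1j * (1 + u) / (1 - u)

values = [-3.0, -0.5, 0.0, 1.0, 7.25]
moduli = [abs(cayley(a)) for a in values]                # each should equal 1
recovered = [inverse_cayley(cayley(a)) for a in values]  # each should equal a
```

Every real $a$ is sent to a point of the unit circle other than $1$, which is the scalar counterpart of the requirement that the elements $y - Uy$ form a dense lineal.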

THEOREM 2. If $U$ is an isometric operator, mapping the subspace $L'$
onto the subspace $L''$, and formula (124) for $y$ belonging to $L'$ defines
a lineal $l$ dense in $H$, then (125) defines a closed symmetric operator
$A$ on $l$, where $U$ is the Helly transformation of $A$, whilst $L'$ and $L''$
coincide with $L_i(A)$ and $L_{-i}(A)$.

We must show first of all that, given distinct $y$ of $L'$, (124) yields
distinct $x$, i.e. as above, we must show that $y_0 - Uy_0 = 0$ implies
$y_0 = 0$. We form the scalar product $(y_0, x)$. If we can show that it
vanishes for any $x$ of $l$, we can assert that $y_0 = 0$, since this lineal is
dense in $H$. Thus

$$(y_0, x) = (y_0, y - Uy) = (y_0, y) - (y_0, Uy),$$

or, since $U$ is isometric:

$$(y_0, x) = (Uy_0, Uy) - (y_0, Uy) = (Uy_0 - y_0, Uy) = (0, Uy) = 0,$$

which we wanted to prove. Given any $x \in l$, we obtain in accordance
with (124) a definite $y \in L'$, and in accordance with (125), a definite
$Ax$. A distributive operator $A$ is thus obtained. Let $x'$ and $x''$ be two
elements of $l$, and $y'$, $y''$ the corresponding elements of $L'$. Using
the fact that $U$ is isometric, we obtain:

$$(Ax', x'') = (i(y' + Uy'), y'' - Uy'') = i(y', y'') + i(Uy', y'') - i(y', Uy'') - i(Uy', Uy'') = i(Uy', y'') - i(y', Uy'').$$

We get the same result on expanding $(x', Ax'') = (y' - Uy', i(y'' + Uy''))$,
i.e. $(Ax', x'') = (x', Ax'')$, so that $A$ is symmetric. Let
us show that $A$ is closed. Let $x_n \in l$ be such that

$$x_n \Rightarrow x \quad\text{and}\quad Ax_n \Rightarrow w. \tag{126}$$

We have to show that $x \in l$ and $w = Ax$. Let $y_n$ denote the elements
of $L'$ corresponding to $x_n$, i.e.

$$x_n = y_n - Uy_n; \qquad Ax_n = i(y_n + Uy_n); \tag{127}$$

it follows from these equations that $y_n = (1/2i)(Ax_n + ix_n) \Rightarrow (1/2i)(w + ix)$.
On writing this limit as $y$ for brevity, we can
say that $y \in L'$, since $L'$ is a subspace, and that $Uy_n \Rightarrow Uy$, since
$U$ is an isometric operator, and hence $\|U(y - y_n)\| = \|y - y_n\|$.
On passing to the limit in (127), we get $x = y - Uy$ and $w = i(y + Uy)$,
where $y \in L'$, i.e. $x \in l$ and $w = Ax$, which we wanted to
prove. Finally, replacing $x$ by $2ix$ in (124) and (125) gives

$$y = (A + iE)x \quad\text{and}\quad Uy = (A - iE)x \qquad (x \in l),$$

whence it follows at once that $U$ is a Helly transformation of $A$, and
that $L'$ and $L''$ are $L_i(A)$ and $L_{-i}(A)$; the theorem is proved.

The above theorems throw light on the possible extensions of a
closed symmetric operator $A$. Let $B$ be a closed symmetric extension
of $A$ (not coinciding with $A$). The right-hand sides of the
equations

$$y = (B + iE)x; \qquad z = (B - iE)x$$

are defined on the lineal $D(B)$, wider than $D(A)$, whilst they give the
same result for elements $x$ belonging to $D(A)$ as the right-hand sides
of (119) and (120). The subspaces $L_i(B)$ and $L_{-i}(B)$ are therefore
strictly wider than the subspaces $L_i(A)$ and $L_{-i}(A)$, and if we write $V$
for the Helly transformation of the operator $B$, we can say that
$V$ transforms $L_i(B)$ into $L_{-i}(B)$, and coincides with $U$ on $L_i(A)$, i.e.
the isometric operator $V$ is an extension of the isometric operator $U$.
On using Theorem 2, we can say that, conversely, any extension of
an isometric operator $U$, leading to another isometric operator $V$,
yields, in accordance with

$$x = y - Vy; \qquad Bx = i(y + Vy) \qquad (y \in D(V)), \tag{128}$$

a closed symmetric operator $B$ which is an extension of $A$. By what
has been proved, it is only in this way that extensions of $A$ can be
obtained that are closed symmetric operators.

If $A$ is a self-conjugate operator, $L_i(A)$ and $L_{-i}(A)$ are the whole
of $H$, and extension is impossible.


202. Deficiency indices. Let $M_i(A)$ and $M_{-i}(A)$ denote the subspaces
complementary to $L_i(A)$ and $L_{-i}(A)$, and let $p$ and $q$ be the numbers
of dimensions of the former subspaces. If, say, $L_i(A)$ is the whole
of $H$, $M_i(A)$ is absent, and we take $p = 0$.

If $M_i(A)$ is finite-dimensional, $p$ is the number of its dimensions,
whilst $p = \infty$ if $M_i(A)$ is infinite-dimensional. The number pair
$(p, q)$ defines the so-called deficiency indices of the operator $A$. Let
us prove a number of simple theorems on deficiency indices.

THEOREM 1. The necessary and sufficient condition for a symmetric
closed operator $A$ to be self-conjugate is that both its deficiency indices
be zero.

If $A$ is self-conjugate, (119) and (120) transform $D(A)$ into $H$, as
we know from [201], so that $p = q = 0$. Suppose conversely that
$p = q = 0$. Now, $L_i(A)$ and $L_{-i}(A)$ coincide with $H$, and $U$ is a unitary
operator (defined in the whole of $H$, like $U^{-1}$). Let $(Ax, v) = (x, v^*)$
for all $x$ of $D(A)$. We have to show that $v \in D(A)$ and $v^* = Av$. By
(124) and (125), the previous equation can be rewritten as

$$i(y, v) + i(Uy, v) = (y, v^*) - (Uy, v^*),$$

whence, since $U$ is unitary,

$$(y, v^* - U^{-1}v^* + iv + iU^{-1}v) = 0,$$

and we have, since $y$ is arbitrary: $v^* + iv = U^{-1}(v^* - iv)$. On writing
$v^* + iv = 2iy'$, we have $v^* - iv = 2iUy'$, whence $v = y' - Uy'$ and
$v^* = i(y' + Uy')$, i.e. $v \in D(A)$ and $v^* = Av$; the theorem is proved.
The sufficiency is also a consequence of Theorem 2 of [187].

Before proving the next theorems, the structure of isometric
operators must be explained. It can be shown, precisely as for unitary
operators [137], that an isometric transformation $U$ of the subspace
$L'$ into the subspace $L''$ amounts to a transformation of a complete
orthonormal system $x_1, x_2, \ldots$ in $L'$ into a similar system $y_1, y_2, \ldots$
in $L''$, so that $Ux_k = y_k$ ($H$ is assumed separable) and

$$U \sum_k a_k x_k = \sum_k a_k y_k.$$

Here, obviously, either both subspaces are infinite-dimensional or
both have the same number of dimensions. If this condition is fulfilled,
in view of the arbitrary choice of base vectors, we can form an infinite
set of isometric mappings of $L'$ onto $L''$. If we have an isometric
operator $U$, mapping $L_i(A)$ onto $L_{-i}(A)$, we can only widen it by
the addition of the same number of new base vectors from $M_i(A)$
and $M_{-i}(A)$, and by establishing a correspondence between them.
The following general theorem is an immediate consequence of these
remarks.

THEOREM 2. The necessary and sufficient condition for a closed 
symmetric operator A to be extensible whilst preserving its symmetry 
is that both the deficiency indices of A be non-zero. If this condition is 
fulfilled, an infinite set of extensions exists. The necessary and sufficient 
condition for A to be extensible as far as a self-conjugate operator is that 
the deficiency indices of A be the same (non-zero), and if this is the 
case, an infinite set of such extensions exists. 

A general scheme can be described for extending an operator
$A$. Having extracted from $M_i(A)$ and $M_{-i}(A)$ any given subspaces
$N_i$ and $N_{-i}$ with the same number of dimensions, we form some
isometric operator $V$, mapping $N_i$ onto $N_{-i}$. We define for the
extended operator $B$ the subspace $L_i(B)$ as the orthogonal sum
$L_i(A) \oplus N_i = L_i(B)$, so that every element $y$ of $L_i(B)$ is uniquely
expressible as $y = y' + y''$, where $y' \in L_i(A)$ and $y'' \in N_i$. The
extended isometric operator $V$ is defined by $Vy = Uy' + Vy''$, the
right-hand side of which is a decomposition of $Vy$, belonging to
$L_{-i}(A) \oplus N_{-i}$, onto the orthogonal subspaces $L_{-i}(A)$ and $N_{-i}$. By (124),
the lineal $D(B)$ is defined by $x = y' + y'' - Uy' - Vy'' = (y' - Uy') + (y'' - Vy'')$,
where $(y' - Uy')$ is any element of $D(A)$ and
$y''$ is any element of $N_i$. We can write this fact as

$$x_B = x_A + x_{N_i} - Vx_{N_i}. \tag{129}$$

Similarly, (125) gives $Bx = i(y' + Uy') + iy'' + iVy''$, i.e.

$$Bx_B = Ax_A + ix_{N_i} + iVx_{N_i}. \tag{130}$$


If $N_i$ and $N_{-i}$ coincide with $M_i(A)$ and $M_{-i}(A)$, the last formulae
define a self-conjugate extension of $A$. It is easily shown that the expression
for $x_B$ as the sum (129) is unique. In other words, we have to show
that, if sum (129) is equal to the zero element, all the terms are
equal to the zero element. In fact, if $x_B = 0$, then $Bx_B = 0$, and
(129) and (130) give $x_A + x_{N_i} - Vx_{N_i} = 0$ and
$Ax_A + ix_{N_i} + iVx_{N_i} = 0$. On multiplying the first equation by $i$ and adding to
the second, we get $(A + iE)x_A + 2ix_{N_i} = 0$; the first term in this
sum belongs to $L_i(A)$, whilst the second is orthogonal to $L_i(A)$, whence
it follows that they are both zero, i.e. $x_{N_i} = 0$, so that $Vx_{N_i} = 0$
and $x_A = 0$, which is what we set out to prove.

If one of the deficiency indices is zero, whilst the other is non-zero,
$A$ has no closed symmetric extensions; such an operator is described
as maximal. A self-conjugate operator, i.e. an operator for which
both deficiency indices vanish, is alternatively described as
hypermaximal. Suppose that $A$ has deficiency indices $(1, 1)$, i.e. that the
subspaces $M_i(A)$ and $M_{-i}(A)$ are one-dimensional, and let $v_0$ and $w_0$
be any given elements of them, where $\|v_0\| = \|w_0\| \neq 0$, so that all
their elements are expressible as $v = av_0$, $w = aw_0$, where $a$ is any
complex number. The formula $V(av_0) = e^{i\theta} aw_0$, where $\theta$ is any
given real number from the interval $0 \le \theta < 2\pi$, gives an
isometric mapping of $M_i(A)$ onto $M_{-i}(A)$, and, on adding this
transformation to the $U$ that maps $L_i(A)$ onto $L_{-i}(A)$, we get a unitary
operator $V$; (128) defines a self-conjugate operator $B$, which depends
on the choice of the above-mentioned $\theta$. If the deficiency indices of $A$
are $(2, 2)$, we obtain, on choosing any two mutually orthogonal
normalized elements $v_1$, $v_2$ of $M_i(A)$ and similarly $w_1$, $w_2$ of $M_{-i}(A)$,

$$Vv_k = w_k \qquad (k = 1, 2).$$

If we fix the $v_k$ and choose the $w_k$ by all possible methods, we get all the
different $V$. On extending the isometric transformation $U$ as far as the
unitary $V$, a self-conjugate extension of $A$ is again obtained in accordance
with (128).

Let us prove a further theorem, giving a new characteristic of
the subspaces $M_i(A)$ and $M_{-i}(A)$.

THEOREM 3. $M_i(A)$ is the subspace of eigenelements of the operator
$A^*$ corresponding to the eigenvalue $\lambda = i$, i.e. the subspace of solutions
of the equation $A^* x = ix$, whilst $M_{-i}(A)$ is the subspace of solutions
of $A^* x = -ix$.

Notice that, since $A^*$ is a closed operator, the lineal of its
eigenelements corresponding to a given eigenvalue is always closed, i.e.
is a subspace. The elements $v$ of the subspace $M_i(A)$ are characterized
by the fact that they are orthogonal to $(A + iE)x$ for any $x$ of $D(A)$,
i.e. they are characterized by the equation $((A + iE)x, v) = 0$,
which may be rewritten as $(Ax, v) = (x, iv)$, where $x$ is any element
of $D(A)$. In view of the definition of $A^*$, the last equation is equivalent
to the fact that $v \in D(A^*)$ and $A^* v = iv$, and the assertion of the
theorem regarding $M_i(A)$ is proved. The assertion regarding $M_{-i}(A)$
is proved in the same way. It follows from this theorem that the
existence of non-zero deficiency indices is bound up with the fact that
the operator $A^*$ is no longer symmetric on $D(A^*)$ and has the
eigenvalue $i$ or $(-i)$, or both.

We could have taken any non-real number $\lambda$ of the upper
half-plane and its conjugate $\bar\lambda$ instead of $\pm i$. In this case the formulae

$$y = (A + \lambda E)x; \qquad z = (A + \bar\lambda E)x \qquad (x \in D(A))$$

perform a one-to-one mapping of $D(A)$ onto certain subspaces $L_\lambda(A)$
and $L_{\bar\lambda}(A)$, and we have the isometric mapping $z = Uy$ of the first
onto the second.

The complementary subspaces $M_\lambda(A)$ and $M_{\bar\lambda}(A)$ are the subspaces of
solutions of the equations $A^* x = -\bar\lambda x$ and $A^* x = -\lambda x$. Let $(p_\lambda, q_\lambda)$
denote the dimensionalities of these subspaces. These are the deficiency
indices of $A$. It will be shown below that they do not depend on the
choice of $\lambda$ from the upper half-plane. Formulae (129) and (130)
take the form

$$x_B = x_A + x_{N_\lambda} - Vx_{N_\lambda}; \qquad Bx_B = Ax_A - \bar\lambda x_{N_\lambda} + \lambda Vx_{N_\lambda}.$$

The first gives $D(B)$, which can obviously be written as

$$D(B) = D(A) + (E - V)N_\lambda, \tag{129_1}$$

where $N_\lambda$ is a subspace of $M_\lambda(A)$ and $V$ is the isometric operator
transforming $N_\lambda$ into the subspace $N_{\bar\lambda}$, lying in $M_{\bar\lambda}(A)$.

Let $L_k$ $(k = 1, 2, \ldots, n)$ be lineals. The direct sum of the $L_k$ is
defined as the set $L$ of elements that are expressible as
$x = x_1 + x_2 + \ldots + x_n$, where $x_k \in L_k$, if this expression for $x$ is unique ($L$ is
a lineal). Formula (129$_1$) provides an example of a direct sum. The
direct sum is often written in the form $L = L_1 \dotplus L_2$ (a point over
the $+$ sign).


203. The conjugate operator. We established in Theorem 3 the
connection between the subspaces introduced and the conjugate
operator. We shall end by explaining the composition of $D(A^*)$ and
the connection between $A^*$ and $A$. Let $x_A$, $x_i$ and $x_{-i}$ denote any
elements of $D(A)$, $M_i(A)$ and $M_{-i}(A)$. We have the theorem:

THEOREM. The formula

$$v = x_A + x_i + x_{-i} \tag{131}$$

gives the whole of the lineal $D(A^*)$ and

$$A^* v = Ax_A + ix_i - ix_{-i}. \tag{132}$$


Every element of $D(A^*)$ is uniquely expressible by (131).

It follows at once from the fact that $A^*$ is an extension of $A$ and
Theorem 3 of [202] that the elements $v$ defined by (131) belong to
$D(A^*)$ and that (132) holds. Let us show that $v$ is uniquely expressible
by (131), i.e. that

$$x_A + x_i + x_{-i} = 0 \tag{133}$$

implies that all three terms are zero elements.

Application of the operator $A^*$ to (133) gives $Ax_A + ix_i - ix_{-i} = 0$,
and, on multiplying (133) by $i$ and adding to this last equation, we
get $(A + iE)x_A + 2ix_i = 0$. The terms on the left-hand side are
mutually orthogonal, so that both vanish, i.e. $x_i = 0$. Similarly, on
multiplying (133) by $(-i)$, we get $x_{-i} = 0$ and, in view of (133),
$x_A = 0$. It remains to show that every element $v$ of $D(A^*)$ is
expressible by (131). Such an element is characterized by the fact that
$(Ax, v) = (x, v^*)$ for all $x$ of $D(A)$, or, in view of (124) and (125),
$(i(y + Uy), v) = (y - Uy, v^*)$, whence it follows that

$$(Uy, v^* - iv) = (y, v^* + iv) \qquad (y \in L_i(A)). \tag{134}$$

On projecting onto mutually orthogonal subspaces, we can write

$$v^* - iv = x_{L_{-i}} + x_{-i}; \qquad v^* + iv = x_{L_i} + x_i, \tag{135}$$

where $x_{L_i} \in L_i(A)$ and $x_{L_{-i}} \in L_{-i}(A)$. On substituting in (134) and
using the fact that $x_{-i} \perp Uy$ and $x_i \perp y$, we get
$(Uy, x_{L_{-i}}) = (y, x_{L_i})$ or, on putting $x_{L_i} = y'$ and $x_{L_{-i}} = Uy''$, where $y'$ and $y''$
belong to $L_i(A)$, we can write $(Uy, Uy'') = (y, y')$; further, since $U$
is isometric, $(y, y'' - y') = 0$ for any $y$ of $L_i(A)$, and in particular,
for $y = y'' - y'$, which leads to the equation $y'' = y'$, i.e. $x_{L_i} = y'$
and $x_{L_{-i}} = Uy'$. On substituting in (135) and subtracting these
equations term by term, we get
$v = (1/2i)(y' - Uy') + (1/2i)x_i - (1/2i)x_{-i}$,
whence it follows that $v$ is expressible by (131), since
$(1/2i)x_i \in M_i(A)$ and $(-1/2i)x_{-i} \in M_{-i}(A)$.
A corollary of the theorem must be mentioned. It follows directly
from the theorem that

$$w = (A^* - iE)v \tag{136}$$

transforms the lineal $D(A^*)$ of elements $v$ into $H$. For, by (131) and
(132), we get $w = (A - iE)x_A - 2ix_{-i}$, the first term being any
element of $L_{-i}(A)$ and the second any element of the complementary
subspace $M_{-i}(A)$. Thus, if we interpret (136) as an equation in $v$,
it has solutions for any $w \in H$. The homogeneous equation
$(A^* - iE)v = 0$ now has the subspace $M_i(A)$ as its set of solutions. If this
subspace is non-trivial, i.e. the first deficiency index $p \neq 0$, (136) has an
infinite set of solutions $v = v_0 + x_i$ for any $w$, where $v_0$ is any particular
solution of (136), and $x_i$ any element of $M_i(A)$. A particular solution $v_0$ has
the form (131), and, in view of the possibility of adding to the
solution any element of $M_i(A)$, we can assume that $v_0$ does not contain
$x_i$, i.e. the solution of (136) can be written as $v = (x_A + x_{-i}) + x_i$,
where $x_A$ and $x_{-i}$ are definite elements and $x_i$ is an arbitrary element.
If $p = 0$, $x_i$ is missing, and a definite solution is obtained. Similarly,
the equation

$$(A^* + iE)v' = w'$$

with an arbitrary $w'$ of $H$, has the general solution
$v' = (x'_A + x'_i) + x'_{-i}$, where $x'_A$ and $x'_i$ are definite elements, whilst $x'_{-i}$ is
arbitrary. If the second deficiency index $q = 0$, $x'_{-i}$ is absent.

Notice that the formulae hold: $v = x_A + x_\lambda + x_{\bar\lambda}$;
$A^* v = Ax_A - \bar\lambda x_\lambda - \lambda x_{\bar\lambda}$ $(\operatorname{Im}\lambda > 0)$, analogous to (131) and (132).


204. Maximal operators. A simple method can be given for forming
maximal operators. We choose a complete orthonormal system in $H$:

$$x_1, x_2, \ldots \tag{137}$$

and define an isometric transformation $U$ by the formula
$Ux_k = x_{k+1}$ $(k = 1, 2, \ldots)$, i.e. given any element $y$ of $H$:

$$y = \sum_{k=1}^{\infty} a_k x_k \qquad \left(\sum_{k=1}^{\infty} |a_k|^2 < +\infty\right)$$

we have

$$Uy = \sum_{k=1}^{\infty} a_k x_{k+1}.$$


Following the notation of [201], we can say that $L'$ is $H$, whilst
$L''$ is formed by all the base vectors (137) except for $x_1$, and formulae
(124) and (125), in which $y$ is any element of $H$, lead us to a closed
symmetric operator $A$ with deficiency indices $(0, 1)$. We only have to
verify that the lineal $D(A)$ formed by the elements $y - Uy$ is dense
in $H$. Obviously, all we need is to prove the existence of elements
$x$ of $D(A)$ such that the norm $\|x_p - x\|$ is as small as desired, where
$x_p$ is any given base vector. We form the element

$$y = \sum_{s=0}^{m-1} \frac{m - s}{m}\, x_{p+s},$$

where $m$ is a positive integer. We have

$$x = y - Uy = \sum_{s=0}^{m-1} \frac{m - s}{m}\, x_{p+s} - \sum_{s=0}^{m-1} \frac{m - s}{m}\, x_{p+s+1} = x_p - \frac{1}{m} \sum_{s=1}^{m} x_{p+s},$$

whence it follows, by Pythagoras’ theorem and the fact that the
$x_{p+s}$ are normed, that

$$\|x_p - x\|^2 = \frac{1}{m^2} \sum_{s=1}^{m} \|x_{p+s}\|^2 = \frac{1}{m},$$

and, on letting $m$ increase indefinitely, we get values as small as desired
for $\|x_p - x\|$, which is what we set out to prove. A maximal operator
of this type is called an elementary symmetric operator. If we put
$B = -A$, $D(B)$ becomes equal to $D(A)$, and we get instead of (119)
and (120): $y = (-B + iE)x$, $z = (-B - iE)x$, whence it is clear,
on replacing $x$ by $(-x)$, that $(B + iE)x$ maps the lineal $D(B)$
onto the subspace $L_{-i}(A)$ and $(B - iE)x$ maps $D(B)$ onto
$L_i(A)$, i.e. when $A$ is replaced by $(-A)$, $L_i$ and $L_{-i}$ change places,
and consequently, if $A$ is an operator of the type described, with
deficiency indices $(0, 1)$, $(-A)$ has deficiency indices $(1, 0)$.
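
The density argument above can be checked with exact rational arithmetic. The sketch below is an illustration only: it stores the coordinates of $y$ with respect to the basis (137) in a finite list, applies the shift $Ux_k = x_{k+1}$, and computes $\|x_p - x\|^2$; the function name `approximation_error_sq` is hypothetical.

```python
from fractions import Fraction

def approximation_error_sq(p, m):
    """Return ||x_p - (y - Uy)||^2 for y = sum_{s=0}^{m-1} ((m - s)/m) x_{p+s}."""
    size = p + m + 1                      # largest coordinate index that is needed
    y = [Fraction(0)] * (size + 1)        # index 0 is unused; y[k] multiplies x_k
    for s in range(m):
        y[p + s] = Fraction(m - s, m)
    # The shift U moves every coordinate one place up: U x_k = x_{k+1}.
    Uy = [Fraction(0)] * (size + 1)
    for k in range(1, size):
        Uy[k + 1] = y[k]
    x = [a - b for a, b in zip(y, Uy)]    # x = y - Uy, an element of D(A)
    target = [Fraction(0)] * (size + 1)   # coordinates of the base vector x_p
    target[p] = Fraction(1)
    # Pythagoras' theorem: sum of squared coordinate differences.
    return sum((t - c) ** 2 for t, c in zip(target, x))

error = approximation_error_sq(3, 10)     # equals Fraction(1, 10)
```

The result depends only on $m$ and not on $p$, in agreement with the value $1/m$ found above.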

Let $U_0$ be a unitary operator transforming the base vectors (137)
into the base vectors $x'_k$. Application of the above method to the
$x'_k$ gives us an isometric operator $U' x'_k = x'_{k+1}$ and an elementary
symmetric operator $A'$, where obviously, $U' = U_0 U U_0^{-1}$,
$A' = U_0 A U_0^{-1}$, and $D(A')$ is obtained from $D(A)$ with the aid of the
operator $U_0$. It can be shown that, if $A$ is any closed symmetric
operator with deficiency indices $(0, q)$, where $q > 0$ and is finite, then
$H$ can be written as an orthogonal sum of subspaces:
$H = L_0 \oplus L_1 \oplus L_2 \oplus \ldots \oplus L_q$, reducing the operator $A$ and such that each
$L_k$ with $k \ge 1$ is infinite-dimensional, whilst the operator $A_k$, induced
by $A$ in $L_k$, is an elementary symmetric operator; the subspace
$L_0$, which may in fact be absent, may be either infinite- or
finite-dimensional, and the operator $A_0$ induced in it is self-conjugate.
A similar result holds when $q = \infty$.

In the case of deficiency indices $(p, 0)$, where $p > 0$, the $A_k$ are
the elementary symmetric operators with reversed sign.


205. Extension of symmetric semi-bounded operators. Let $A$ be a
symmetric semi-bounded operator with lower bound $m_A$:

$$(Ax, x) \ge m_A (x, x). \tag{138}$$

We shall assume for the present that $m_A > 0$, i.e. that $A$ is a
positive definite operator. We shall try to widen it in such a way that it
remains symmetric and its range of values becomes the whole
of $H$. As we know [187], such an extension leads to a self-conjugate
operator.

We associate the operator $A$ with a real-valued quadratic functional

$$J_y(x) = (Ax, x) - (y, x) - (x, y) \qquad (x \in D(A)), \tag{139}$$

where $y$ is any fixed element of $H$, and we consider the question of
its minimum.
THEOREM 1. If the equation

$$Ax = y \tag{140}$$

has a solution $x \in D(A)$, then $J_y(x) \le J_y(z)$, where $z$ is any element
of $D(A)$ and the $=$ sign holds only with $z = x$. Conversely, if, for a given
$x \in D(A)$, $J_y(x) \le J_y(z)$, where $z$ is any element of $D(A)$, then $x$ satisfies
equation (140).

Let $x \in D(A)$ and satisfy (140). Given any $z \in D(A)$, we have,
since $A$ is symmetric,

$$J_y(z) = (Az, z) - (y, z) - (z, y) = (Az, z) - (Ax, z) - (z, Ax) = (A(z - x), z - x) - (Ax, x) = (A(z - x), z - x) + J_y(x),$$

and the first part of the theorem follows from

$$(A(z - x), z - x) \ge m_A \|z - x\|^2$$

and $m_A > 0$.

Conversely, if $x \in D(A)$ and $J_y(x) \le J_y(z)$ for all $z \in D(A)$, the
quadratic function $J_y(x + tz)$ of the real parameter $t$ has a minimum
at $t = 0$ for any fixed $z \in D(A)$. Hence it follows that

$$(Ax, z) + (Az, x) - (y, z) - (z, y) = 0,$$

or

$$(Ax, z) + (z, Ax) - (y, z) - (z, y) = 0,$$

i.e. $\operatorname{Re}(Ax - y, z) = 0$, where $\operatorname{Re}$ is the sign of the real part. On
replacing $z$ by $iz$, we get $\operatorname{Im}(Ax - y, z) = 0$, i.e. $(Ax - y, z) = 0$,
and, since $D(A)$ is dense in $H$, we get $Ax - y = 0$, which proves the
second part.
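
Theorem 1 can be illustrated in a real two-dimensional space, where $(y, x) + (x, y) = 2(x, y)$. The sketch below is an assumption-laden example rather than part of the proof: the matrix `A`, the vector `y` and the trial point `z` are hypothetical values, and the solution of (140) is found by Cramer's rule.

```python
def matvec(m, v):
    """Apply the matrix m (list of rows) to the vector v."""
    return [sum(m[r][c] * v[c] for c in range(len(v))) for r in range(len(m))]

def dot(u, v):
    """Real scalar product."""
    return sum(a * b for a, b in zip(u, v))

def J(A, y, x):
    """J_y(x) = (Ax, x) - (y, x) - (x, y) in a real space."""
    return dot(matvec(A, x), x) - 2.0 * dot(y, x)

A = [[2.0, 1.0], [1.0, 3.0]]   # symmetric and positive definite (hypothetical)
y = [1.0, -2.0]

# Solve Ax = y by Cramer's rule.
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
x = [(y[0] * A[1][1] - A[0][1] * y[1]) / det,
     (A[0][0] * y[1] - A[1][0] * y[0]) / det]

z = [0.5, 0.25]                            # arbitrary trial element
diff = [a - b for a, b in zip(z, x)]
gap = dot(matvec(A, diff), diff)           # (A(z - x), z - x)
```

The difference `J(A, y, z) - J(A, y, x)` coincides with `gap`, which is the identity used in the first part of the proof.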

This theorem shows that the functional $J_y(x)$ attains a least value
for some $x \in D(A)$ only for those $y \in H$ for which equation (140) has
a solution in $D(A)$. For this reason we shall subsequently widen the
domain of definition of our functional. In
accordance with (139), it is defined on $D(A)$.

We introduce a new scalar product into $D(A)$ by putting

$$[x, y]_A = (Ax, y) \quad\text{or simply}\quad [x, y] = (Ax, y), \tag{141}$$

so that the new norm is

$$\|x\|_A^2 = [x, x] = (Ax, x). \tag{142}$$

We have by (138):

$$\|x\| \le \frac{1}{\sqrt{m_A}}\,\|x\|_A. \tag{143}$$
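
A small numerical sketch may help to fix the meaning of (143). It assumes a real two-dimensional space and a hypothetical positive definite matrix `A`, for which the lower bound $m_A$ can be taken as the smallest eigenvalue; the names `norm` and `energy_norm` are chosen only for this illustration.

```python
import math

def matvec(m, v):
    """Apply the matrix m (list of rows) to the vector v."""
    return [sum(m[r][c] * v[c] for c in range(len(v))) for r in range(len(m))]

def dot(u, v):
    """Real scalar product."""
    return sum(a * b for a, b in zip(u, v))

A = [[2.0, 1.0], [1.0, 3.0]]   # symmetric and positive definite (hypothetical)

# Smallest eigenvalue of a symmetric 2x2 matrix, used as the lower bound m_A.
trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
m_A = (trace - math.sqrt(trace * trace - 4.0 * det)) / 2.0

def norm(x):
    """Ordinary norm ||x||."""
    return math.sqrt(dot(x, x))

def energy_norm(x):
    """New norm ||x||_A = sqrt((Ax, x)), formula (142)."""
    return math.sqrt(dot(matvec(A, x), x))

# Test inequality (143) on unit vectors pointing in many directions.
samples = [[math.cos(0.1 * k), math.sin(0.1 * k)] for k in range(63)]
holds = all(norm(x) <= energy_norm(x) / math.sqrt(m_A) + 1e-12 for x in samples)
```

In finite dimensions the two norms are equivalent, so the completion discussed next adds nothing new; the construction matters only when $A$ is unbounded.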


It is easily verified that, given the previous definition of
multiplication by a number and of addition, and with the scalar product (141),
the elements of the lineal $D(A)$ satisfy all the Hilbert space axioms,
excepting possibly the axiom of completeness. If this axiom is not
satisfied, we can fill out $D(A)$ with new ideal elements so as to obtain
a complete Hilbert space, which we shall write as $H_A$. We must
investigate this completion. Let a fundamental sequence of $D(A)$ be given,
with scalar product (141). By (143), it will also be a fundamental
sequence in $H$, and will have some limit $x'$ in $H$, since $H$ is complete.
Fundamental sequences of $D(A)$ with scalar product (141), belonging to
the same class, lead to the same element $x'$ of $H$, i.e. if $x_n$ and
$y_n \in D(A)$ $(n = 1, 2, \ldots)$ and $\|y_n - x_n\|_A \to 0$ as $n \to \infty$, then
$\|y_n - x_n\| \to 0$ also. This follows from (143). We must also show
that distinct elements $x'$ and $y'$ of $H$ correspond to fundamental
sequences $x_n$ and $y_n$ of $D(A)$ with scalar product (141), belonging to
different classes.

We have for any $z \in D(A)$:

$$(Az, x_n - y_n) = [z, x_n - y_n].$$

Passage to the limit gives

$$(Az, x' - y') = [z, v],$$

where the $v$ on the right is a non-zero element of $H_A$, since the
sequences $x_n$ and $y_n$ belong to different classes in $H_A$. If it turned out that
$x' = y'$, we should have $[z, v] = 0$; but this is impossible, since $D(A)$
is dense in $H_A$, and the element $v \in H_A$ is non-zero.

Let us extend the definition of the functional $J_y(x)$ to the whole
of $H_A$ by putting

$$J_y(x) = [x, x] - (y, x) - (x, y), \tag{144}$$

and investigate the above minimum problem for it. Given any fixed
$y \in H$, $(x, y)$ is a linear bounded functional of $x$ in $H_A$, since

$$|(x, y)| \le \|x\| \cdot \|y\| \le \frac{1}{\sqrt{m_A}}\,\|y\|\,\|x\|_A,$$

and, by a familiar theorem [123], there exists a unique element
$x_0 \in H_A$ such that

$$(x, y) = [x, x_0] \qquad (x \in H_A;\ y \in H), \tag{145}$$

and hence $(y, x) = [x_0, x]$. Expression (144) can be written as

$$J_y(x) = [x, x] - [x_0, x] - [x, x_0] = [x - x_0, x - x_0] - [x_0, x_0],$$

whence it follows that $J_y(x_0) \le J_y(x)$ for all $x \in H_A$, where the $=$ sign
only holds for $x = x_0$. Moreover, it follows from (145) and the fact
that $H_A$ is dense in $H$ that distinct $x_0$ correspond to distinct $y \in H$.
It follows from the same equation (145) that the set of all solutions
$x_0$ of the variational problems corresponding to all possible $y \in H$
is a lineal $l$. What has been said enables us to define a distributive
operator $\tilde{A}$ on $l$ in accordance with

$$\tilde{A}x_0 = y, \tag{146}$$

and this operator has an inverse defined on the whole of $H$. By Theorem
1, $\tilde{A}$ is a widening of $A$. It is natural to write $D(\tilde{A})$ instead of $l$. Notice
that $D(\tilde{A}) \subset H_A$. Let us show that $\tilde{A}$ is a symmetric operator in
$H$. It follows from (145) that

$$(\tilde{A}x_0, x) = [x_0, x] \qquad (x_0 \in D(\tilde{A});\ x \in H_A), \tag{147}$$

and we have, on putting $x = x_0$:

$$(\tilde{A}x_0, x_0) \ge 0,$$

i.e. $(\tilde{A}x_0, x_0)$ is real for $x_0 \in D(\tilde{A})$, whence follows the symmetry of
$\tilde{A}$ [187]. The inverse $\tilde{A}^{-1}$ is defined on the whole of $H$ and is also
symmetric, i.e. it is a bounded self-conjugate operator, i.e. $\tilde{A}$ is also
self-conjugate in $H$. We have thus proved the following:

THEOREM 2. A symmetric operator $A$, satisfying condition (138)
with $m_A > 0$, admits of a self-conjugate extension $\tilde{A}$ such that $\tilde{A}^{-1}$ is
defined in the whole of $H$ and is bounded.

The construction given above for the widening of a semi-bounded 
symmetric operator is due to Friedrichs (Math. Ann., 109, 4/5, 
1934). The proof is taken from the book by S. G. Mikhlin The Problem 
of the Minimum of a Quadratic Functional (Problema minimuma 
kvadratichnogo funktsionala) (1952). 

Now suppose that the symmetric operator $A$ satisfies condition
(138) with $m_A \le 0$. Now, the symmetric operator
$B = A + (\varepsilon - m_A)E$ $(\varepsilon > 0)$, for which $D(B) = D(A)$, satisfies the condition
$(Bx, x) \ge \varepsilon(x, x)$ for $x \in D(B)$, and the above theorem leads us to
the following:

THEOREM 3. Every symmetric semi-bounded operator $A$ admits of
a self-conjugate extension $\tilde{A}$, such that, given any $\varepsilon > 0$, the operator
$[\tilde{A} + (\varepsilon - m_A)E]^{-1}$ is defined throughout $H$ and is bounded.

If $A$ is not self-conjugate, it admits of an infinite set of self-conjugate
extensions. The extension $\tilde{A}$ obtained is usually called a Friedrichs
extension.

If $x \in D(A)$, we had (143) with $m_A > 0$. Suppose that $x \in H_A$, but does not belong to $D(A)$. There now exists, by definition of $H_A$, a sequence $x_n \in D(A)$ $(n = 1, 2, \ldots)$ such that $x_n \to x$ in the norm of $H_A$, and all the more in the norm of $H$.

By writing the inequality for $x_n$ and using the continuity of the norm, we can say that, with $m_A > 0$, inequality (143) is true for the whole of $H_A$.

If $x \in D(\tilde A)$ and hence $x \in H_A$, it follows from (145) that $(\tilde A x, x) = [x, x] = \|x\|_A^2$, and inequality (143) gives

$$(\tilde A x, x) \ge m_A (x, x). \tag{148}$$

We have assumed during the proof that $m_A > 0$. If $m_A \le 0$, we form the operator $B = A + (\varepsilon - m_A)E$, where $\varepsilon > 0$. By what has been said, we have $(\tilde B x, x) \ge \varepsilon(x, x)$, where $\tilde B = \tilde A + (\varepsilon - m_A)E$, whence (148) follows. We thus have the following result.

THEOREM 4. Inequality (148) holds for $\tilde A$.
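
The reduction to the positive definite case used for Theorem 4 can be written out as a short computation. The sketch below assumes only what the text states: $B = A + (\varepsilon - m_A)E$ with $\varepsilon > 0$, and $\tilde B = \tilde A + (\varepsilon - m_A)E$ for its Friedrichs extension; $x$ is any element of $D(\tilde A)$.

```latex
% Shift argument: the bound for \tilde B gives the bound (148) for \tilde A.
\begin{align*}
(\tilde B x, x) &= (\tilde A x, x) + (\varepsilon - m_A)(x, x) \;\ge\; \varepsilon\,(x, x), \\
\text{hence}\quad (\tilde A x, x) &\ge \varepsilon\,(x, x) - (\varepsilon - m_A)(x, x) = m_A\,(x, x).
\end{align*}
```

The auxiliary number $\varepsilon$ cancels, so the resulting bound does not depend on how the shift was chosen.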

We now take the case $m_A > 0$ and prove the following:

THEOREM 5. Given a positive definite symmetric operator $A$, we have $H_A = D(\tilde A^{1/2})$ and

$$[x, x] = \|\tilde A^{1/2} x\|^2. \tag{149}$$
We first show that $H_A \subset D(\tilde A^{1/2})$. Let $x \in H_A$. There exists a sequence $x_n \in D(A)$ $(n = 1, 2, \ldots)$ such that $\|x - x_n\|_A \to 0$, and the equation holds for $x_n$:

$$[x_n, x_n] = (\tilde A x_n, x_n) = \|\tilde A^{1/2} x_n\|^2 = \int_{m_A - 0}^{+\infty} \lambda\, d(\mathcal{E}_\lambda x_n, x_n), \tag{150}$$

where $\mathcal{E}_\lambda$ is the spectral function of $\tilde A$. On noting that

$$\|x - x_n\| \to 0, \qquad \|x_n - x_m\|_A = \|\tilde A^{1/2}(x_n - x_m)\| \to 0$$

and that $\tilde A^{1/2}$ is closed, we can say that $x \in D(\tilde A^{1/2})$,

$$\tilde A^{1/2} x_n \to \tilde A^{1/2} x \quad\text{and}\quad \|\tilde A^{1/2} x\| = \|x\|_A$$

(continuity of the norm), i.e. equation (149) holds for $x$. To complete the proof, we have to show that $D(\tilde A^{1/2}) \subset H_A$. Let $x \in D(\tilde A^{1/2})$, and let us show that $x \in H_A$.

We take the sequence of elements $x_n = \mathcal{E}_n x$ $(n = 1, 2, \ldots)$. It is clear that $x_n \in D(\tilde A)$ and $\|x - x_n\| \to 0$ as $n \to \infty$.

In view of the convergence of the integral

$$\int_{m_A - 0}^{+\infty} \lambda\, d(\mathcal{E}_\lambda x, x)$$

the elements

$$\tilde A^{1/2} x_n = \int_{m_A - 0}^{n} \sqrt{\lambda}\, d\mathcal{E}_\lambda x$$

are convergent to $\tilde A^{1/2} x$ in $H$ as $n \to \infty$, so that

$$\|\tilde A^{1/2}(x_n - x_m)\|^2 = (\tilde A(x_n - x_m),\, x_n - x_m) = \|x_n - x_m\|_A^2 \to 0$$

as $m$ and $n \to \infty$. This means that the $x_n$ form a fundamental sequence in $H_A$. Let $x'$ be the corresponding element of $H_A$. Hence $x_n \to x'$ in $H_A$. But, as we have seen above, $x_n \to x$ in $H$, so that $x = x' \in H_A$, which is what we set out to prove.

When $m_A > 0$, the operator $\tilde A^{-1}$ is defined throughout $H$ and is bounded. We shall now consider the case when it is completely continuous. Let us bring in the operator $W$, which associates each element $x \in H_A$ with the same element $x$, regarded as an element of $H$.

THEOREM 6. The necessary and sufficient condition for $\tilde A^{-1}$ to be a completely continuous operator is that the operator $W$ which embeds $H_A$ in $H$ be completely continuous.
Necessity. Let $\tilde A^{-1}$ be completely continuous. Its spectrum is of purely point type, and it lies in the interval $[0, 1/m_A]$, where the eigenvalues, excepting possibly zero, have finite rank, and only the point $\lambda = 0$ is a point of condensation of the spectrum [136]. The spectrum of the self-conjugate positive operator $\tilde A^{-1/2}$ has the same character: each eigenvalue $\lambda$ is replaced by $\sqrt{\lambda}$ and the eigenelements remain as before.

Hence $\tilde A^{-1/2}$ is also a completely continuous positive operator. We take any set $U$, bounded in $H_A$, such that, if $x \in U$, then $\|x\|_A \le C$, where $C$ is a definite constant. We can apply the operator $\tilde A^{1/2}$ to $x$. Let $y = \tilde A^{1/2} x$, so that $x = \tilde A^{-1/2} y$. We have $\|y\|^2 = (\tilde A^{1/2} x, \tilde A^{1/2} x) = \|x\|_A^2 \le C^2$, i.e. the set of elements $\tilde A^{1/2} x$ is bounded in $H$, and the completely continuous operator $\tilde A^{-1/2}$ transforms them into a compact set in $H$, i.e. a set of $H_A$, bounded in the norm of $H_A$, is in fact compact in $H$, i.e. the operator $W$ is completely continuous.

Let us prove the sufficiency. Let $W$ be a completely continuous operator and $U$ a set of elements bounded in $H$: if $z \in U$, then $\|z\| \le C$. We have to show that the set of elements $y = \tilde A^{-1} z$ is compact in $H$. The elements $y$ obviously belong to $D(\tilde A)$, and we have $\|y\|_A^2 = (\tilde A y, y) = (z, y) \le C\|y\| \le (C/\sqrt{m_A})\,\|y\|_A$, whence $\|y\|_A \le C/\sqrt{m_A}$, i.e. the set is bounded in the norm of $H_A$, and hence, since $W$ is a completely continuous operator, is compact in $H$. The theorem is proved.


206. The comparison of semi-bounded operators. Let $A$ and $B$ be semi-bounded self-conjugate operators. We say that $A$ is not less than $B$, and write $A \ge B$, if $D(A) \subseteq D(B)$ and

$$(Ax, x) \ge (Bx, x) \quad\text{for } x \in D(A). \tag{151}$$

If $A$ and $B$ have purely point spectra and the eigenvalues of $A$ and $B$ can be enumerated in non-decreasing order, whilst taking into account their multiplicities, it can be shown by extending the minimax principle [136] to the case of non-bounded operators, that $\lambda_n(A) \ge \lambda_n(B)$, where $\lambda_n(A)$ and $\lambda_n(B)$ are the $n$th eigenvalues of $A$ and $B$. Let us prove a rather more general theorem.

THEOREM. Let $A \ge B$, and let the spectrum of $B$, situated on the half-line $\lambda < \beta$, where $\beta$ is a certain number, consist of eigenvalues of finite multiplicities, which have no points of condensation less than $\beta$. The spectrum of $A$ now has the same properties and $\lambda_n(A) \ge \lambda_n(B)$ on the half-line.
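It is sufficient to show that, given any $\delta < \beta$, where $\delta$ differs from an eigenvalue of $A$ or $B$, the dimensionality of the subspace corresponding to the projector $\mathcal{E}_\delta$ is not greater than the dimensionality of the subspace corresponding to the projector $\mathcal{F}_\delta$, where $\mathcal{E}_\lambda$ and $\mathcal{F}_\lambda$ are the spectral functions of $A$ and $B$:

$$\dim \mathcal{E}_\delta H \le \dim \mathcal{F}_\delta H. \tag{152}$$

Let us assume the reverse inequality. There must then exist in $\mathcal{E}_\delta H$ a normalized element $x_0$, orthogonal to the whole of $\mathcal{F}_\delta H$. Notice that $x_0 \in D(A)$, and hence $x_0 \in D(B)$, since $x_0 \in \mathcal{E}_\delta H = (\mathcal{E}_\delta - \mathcal{E}_{m_A - 0})H$. We have

$$(Ax_0, x_0) = \int_{m_A - 0}^{+\infty} \lambda\, d(\mathcal{E}_\lambda x_0, x_0) = \int_{m_A - 0}^{\delta} \lambda\, d(\mathcal{E}_\lambda x_0, x_0) \le \delta\,\|x_0\|^2 = \delta. \tag{153}$$

On the other hand, since $x_0 \perp \mathcal{F}_\delta H$, we have

$$(Bx_0, x_0) = \int_{m_B - 0}^{+\infty} \lambda\, d(\mathcal{F}_\lambda x_0, x_0) = \int_{\delta}^{+\infty} \lambda\, d(\mathcal{F}_\lambda x_0, x_0).$$

But, inasmuch as $\mathcal{F}_\lambda$ is constant in some neighbourhood of the point $\lambda = \delta$, there exists an $\varepsilon > 0$ such that

$$(Bx_0, x_0) = \int_{\delta + \varepsilon}^{+\infty} \lambda\, d(\mathcal{F}_\lambda x_0, x_0) \ge \delta + \varepsilon.$$

This inequality contradicts (151) and (153), so that (152) is proved.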
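
The contradiction can be displayed as a single chain. This is only a summary sketch of the argument above, with $x_0$ the normalized element assumed to exist, $\delta$ the chosen point below $\beta$, and $\varepsilon > 0$ the half-width of the neighbourhood of $\delta$ on which the spectral function of $B$ is constant.

```latex
% (153) gives the first inequality, (151) the second,
% and the last display of the proof gives the third.
\[
\delta \;\ge\; (A x_0, x_0) \;\ge\; (B x_0, x_0) \;\ge\; \delta + \varepsilon ,
\qquad \varepsilon > 0 .
\]
```

Since $\delta \ge \delta + \varepsilon$ is impossible, no such $x_0$ exists, which is exactly inequality (152).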
Note. Let us return to symmetric semi-bounded operators. Let $A$ be such an operator (not necessarily positive definite). Let us define the subspace $H_A$ for it. Let $a$ be any number satisfying $a > -m_A$, so that the operator $A + aE$ is positive definite. We shall assume that $H_A$ consists of all the elements of $H_{A+aE}$, and we introduce into $H_A$ the bilinear functional $[x, y]_A$, which is an extension of $(Ax, y)$ to the whole of $H_A$:

$$[x, y]_A = [x, y]_{A+aE} - a(x, y). \tag{154}$$

The functional $[x, y]_A$ is continuous in $H_A$. It may easily be shown that

$$[x, y]_A = \int_{m_A - 0}^{+\infty} \lambda\, d(\mathcal{E}_\lambda x, y), \tag{155}$$

where $\mathcal{E}_\lambda$ is the spectral function of the self-conjugate operator $\tilde A$. Notice also that $H_{A+aE}$ consists of the same elements for all $a > -m_A$. This follows from the fact that $D(A + aE)$ does not depend on $a$, and the norms of $H_{A+aE}$ are equivalent for all $a > -m_A$. Notice that the operator $A$ may be self-conjugate.

It may easily be shown that (151) is equivalent to the condition

$$[x, x]_A \ge [x, x]_B,$$

and that the spectrum of the self-conjugate operator $A$ on the half-line $\lambda < \beta$, where the role of $\beta$ is indicated in the theorem, can be found as the successive lower bounds of $(Ax, x)$ for $x \in D(A)$ and $\|x\| = 1$, on condition that $x$ is orthogonal to the eigenelements already found [cf. 136]. This problem can be replaced by the problem of the successive minima of $[x, x]_A$ for $x \in H_A$ and $\|x\| = 1$.


207. Examples on the theory of extensions. 1. We have shown that the operator $D = i\,d/dx$ in the space $H = L_2(0, +\infty)$ has no self-conjugate extensions [188]. We shall adhere to the notation of [188] and prove this result by using deficiency indices. Remember that a closed symmetric operator $A$ is the operator $D$ defined on the set of functions $\varphi(x)$, absolutely continuous in any finite interval $[0, a]$, with derivatives of $L_2(0, +\infty)$ and satisfying the condition $\varphi(0) = 0$. The operator $A^*$ is the operator $D$ on the set of functions $\varphi(x)$ satisfying the above-mentioned conditions except for $\varphi(0) = 0$.

Let us form the spaces $M_i(A)$ and $M_{-i}(A)$ of eigenelements of the operator $A^*$, corresponding to the eigenvalues $\pm i$, i.e. the subspaces of solutions of the equations $A^*\varphi(x) = \pm i\varphi(x)$ or $i\varphi'(x) = \pm i\varphi(x)$. We get $\varphi(x) = Ce^{x}$ and $\varphi(x) = Ce^{-x}$. But $e^{x}$ does not belong to $L_2(0, +\infty)$, and we see that the deficiency indices of $A$ are $(0, 1)$. The operator $A$ is a maximal operator, $L_i(A)$ is the whole of $L_2(0, +\infty)$ and $L_{-i}(A)$ consists of the functions belonging to $L_2(0, +\infty)$ and orthogonal to $e^{-x}$ on the interval $(0, +\infty)$.
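
The membership claims for the two candidate eigenfunctions can be checked directly. The following is a minimal worked computation, assuming the two solutions are $e^{x}$ and $e^{-x}$ (the solutions of $\varphi' = \pm\varphi$ up to a constant factor):

```latex
% Square-integrability on (0, +infinity) of the solutions of phi' = +phi and phi' = -phi.
\[
\int_0^{\infty} \bigl|e^{x}\bigr|^2 \, dx = \int_0^{\infty} e^{2x}\, dx = +\infty ,
\qquad
\int_0^{\infty} \bigl|e^{-x}\bigr|^2 \, dx = \int_0^{\infty} e^{-2x}\, dx = \tfrac{1}{2} .
\]
```

Only the decaying exponential lies in $L_2(0, +\infty)$, so one deficiency subspace is trivial and the other is one-dimensional.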

If we introduce the orthonormal system of Laguerre functions: $\varphi_k(x) = e^{-x} p_k(x)$ $(k = 0, 1, 2, \ldots)$, where $p_k(x)$ is a polynomial of degree $k$, it is easily shown that $U\varphi_k(x) = \varphi_{k+1}(x)$, where $U$ is an isometric operator mapping $L_i(A)$ onto $L_{-i}(A)$, i.e. $A$ is an elementary symmetric operator.

2. Let us take the operator $L(y) = -y''$ in the space $H = L_2(-\infty, +\infty)$. Let $A$ denote this operator on the lineal $D(A)$ of finite functions having continuous derivatives up to the second order. This operator is symmetric. As may easily be seen, the conjugate $A^*$ is the same operator $L(y)$ on the lineal of functions $y(x)$ with the following properties: $y(x)$ and $y'(x)$ are absolutely continuous in any finite interval, whilst $y(x)$ and $y''(x) \in L_2(-\infty, +\infty)$. It can be shown that here $y'(x) \in L_2(-\infty, +\infty)$ also. The operator $A^{**} = \bar A$ coincides with $A^*$, i.e. $\bar A$ is a self-conjugate operator [cf. 188]. The equations $-y'' = \pm iy$ have no solutions in $L_2(-\infty, +\infty)$. Let us now take the same operator $L(y)$ on the interval $[0, +\infty)$. Let $l'$ be the lineal of functions $y(x)$ with the following properties: $y(x)$ and $y'(x)$ are absolutely continuous in any finite interval $[0, a]$, whilst $y(x)$ and $y''(x) \in L_2(0, +\infty)$. Let us also find the lineal $l$ of the elements $y(x)$ of $l'$ which satisfy the conditions $y(0) = y'(0) = 0$ and

$$\lim_{x \to +\infty} (-y'z + yz') = 0,$$

for any $z(x) \in l'$.
If $A$ is the operator $L(y)$ on $l$, $A^*$ is the same operator on $l'$, where $A$ is a closed symmetric operator. The equations $-y'' = \pm iy$ each have one solution in $L_2(0, +\infty)$ (discounting a constant factor):

$$y = e^{-\frac{1}{\sqrt{2}}(1 \mp i)x},$$

so that the deficiency indices of $A$ are $(1, 1)$. To obtain a self-conjugate extension, we have to impose a boundary condition at the end $x = 0$. In the case of the condition $y(0) = 0$, the operator has no point spectrum, and the continuous spectrum fills the interval $\lambda \ge 0$. There exists a unique differential solution


$$y_\lambda(x) = -\frac{2}{x}\left(\sqrt{\lambda}\,\cos\sqrt{\lambda}\,x - \frac{1}{x}\,\sin\sqrt{\lambda}\,x\right).$$


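On forming the resolvent, i.e. the solutions of the equation $-y'' + (\sigma + \tau i)y = f(x)$ with the conditions $y(0) = 0$ and $\tau > 0$, and passing to the limit, we get the spectral function

$$\mathcal{E}_\lambda f(x) = \frac{1}{\pi}\int_0^{+\infty} \frac{\sin\sqrt{\lambda}\,(x - t)}{x - t}\, f(t)\, dt - \frac{1}{\pi}\int_0^{+\infty} \frac{\sin\sqrt{\lambda}\,(x + t)}{x + t}\, f(t)\, dt.$$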


All these results are obtained with the aid of simple working. It is easy to show that, if $y(x) \in D(A^*)$, then $y'(x) \in L_2(0, +\infty)$. The theory of linear differential operators of the second order will be discussed in vol. VI.
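
The deficiency solutions of $-y'' = \pm iy$ on the half-line can be verified by substituting an exponential. The computation below is a sketch under the assumption that one looks for solutions of the form $y = e^{kx}$ with a complex constant $k$ (the symbol $k$ is introduced here only for the check):

```latex
% Substituting y = e^{kx} into -y'' = \pm i y gives k^2 = \mp i.
\begin{align*}
-y'' = +iy:&\quad k^2 = -i, \quad k = \pm\tfrac{1}{\sqrt2}(1 - i),
  \quad\text{since } \bigl(\tfrac{1-i}{\sqrt2}\bigr)^2 = \tfrac{1 - 2i - 1}{2} = -i, \\
-y'' = -iy:&\quad k^2 = +i, \quad k = \pm\tfrac{1}{\sqrt2}(1 + i),
  \quad\text{since } \bigl(\tfrac{1+i}{\sqrt2}\bigr)^2 = \tfrac{1 + 2i - 1}{2} = +i. \\
\intertext{Only the root with negative real part gives a square-summable solution:}
y = e^{-\frac{1}{\sqrt2}(1 \mp i)x},&\quad |y|^2 = e^{-\sqrt2\,x},
  \quad \int_0^{\infty} |y|^2\,dx = \tfrac{1}{\sqrt2} < \infty .
\end{align*}
```

Each equation therefore contributes exactly one solution in $L_2(0, +\infty)$, in agreement with the deficiency indices $(1, 1)$; on the whole axis neither root decays in both directions, which is why no solutions exist in $L_2(-\infty, +\infty)$.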

3. In [188] we considered the Laplace operator in the space $L_2(D)$. All the conditions of Theorem 3 of [187] are fulfilled for the operator $A$ given there, and we shall consider the self-conjugate Friedrichs extension $\tilde A$ of $A$. The space $H_A$ is obtained by completing $D(A)$ in the metric

$$\|u\|_A = \sqrt{(Au, u)} = \sqrt{-\int_D \Delta u \cdot \bar u\, dx} = \sqrt{\int_D \sum_k |u_{x_k}|^2\, dx}.$$

But this norm is equivalent to the norm of $\mathring{W}_2^{(1)}(D)$ [114], so that $H_A$ is $\mathring{W}_2^{(1)}(D)$. Remember that $\mathring{W}_2^{(1)}(D)$ is obtained by completion in the norm of $W_2^{(1)}(D)$ of the set of all once continuously differentiable finite functions. But it may readily be seen (averaging process) that we could take as the initial set the infinitely differentiable finite functions. The functional $J_f(u)$ on $u \in D(A)$ has the form

$$J_f(u) = \int_D \bigl(Au \cdot \bar u - 2\,\mathrm{Re}\,(u \bar f)\bigr)\, dx, \tag{156}$$

where $f(x) \in L_2(D)$, or

$$J_f(u) = \int_D \Bigl[\sum_k |u_{x_k}|^2 - 2\,\mathrm{Re}\,(u \bar f)\Bigr]\, dx. \tag{157}$$

It has a meaning in this form for any function $u \in \mathring{W}_2^{(1)}(D)$, and the variational problem of [205] consists in finding the function $u \in \mathring{W}_2^{(1)}(D)$ which gives the least value of functional (157). We have seen that this problem has a unique solution for any $f(x) \in L_2(D)$. On associating all these solutions, obtained with
different choices of $f(x) \in L_2(D)$, with $D(\tilde A)$, we arrive at the self-conjugate extension $\tilde A$ of $A$, where

$$\tilde A u = f. \tag{158}$$

Since $\tilde A$ is a self-conjugate extension of $A$, we have $D(\tilde A) \subseteq D(A^*)$. But functions of $D(A^*)$ have generalized derivatives up to and including the second order inside $D$, which are square summable over any domain $D'$ lying strictly inside $D$, and the operator $A^*$ is evaluated on them as a Laplace operator [188]. Hence $\tilde A$ is also a Laplace operator, i.e. equation (158) has the form

$$-\sum_k u_{x_k x_k} = f(x). \tag{159}$$

We have thus shown that the solution of the present variational problem belongs to $W_2^{(2)}(D')$ as well as to $\mathring{W}_2^{(1)}(D)$, where $D'$ is any domain lying strictly inside $D$, and that it satisfies Poisson's equation.

On the other hand, equation (159) has an infinite set of solutions of $L_2(D)$. It is enough to add to the solution given above a harmonic function of $L_2(D)$. The condition that the solution belong to $\mathring{W}_2^{(1)}(D)$ distinguishes one solution from this class; we in fact obtained this solution from the variational problem. This solution must vanish in a definite sense on the boundary $S$ of the domain $D$ [113]. This makes clear the connection of our extension of $A$ with the Dirichlet problem for Poisson's equation:

$$-\Delta u = f(x); \qquad u|_S = 0. \tag{160}$$

Everything proved above for the Laplace operator is also valid for general linear elliptic self-conjugate operators of the second order [IV; 147]. The theory of Friedrichs extensions reveals the solubility of the Dirichlet problem for them in a generalized sense, namely in the sense of the solution belonging to $\mathring{W}_2^{(1)}(D)$. In actual fact, it turns out that this generalized solution of the Dirichlet problem, corresponding to a Friedrichs extension of an elliptic operator, belongs to $W_2^{(2)}(D')$ or even to $W_2^{(2)}(D)$, provided only that $S$ is a sufficiently smooth surface. This was established by O. A. Ladyzhenskaya in The Closure of an Elliptic Operator (O zamykanii ellipticheskogo operatora) (Dokl. AN SSSR, t. 79, No. 5, 1951). Ladyzhenskaya's article "A simple proof of the solubility of boundary value problems and of eigenvalue problems for linear elliptic operators" (Prostoe dokazatel'stvo razreshimosti kraevykh zadach i zadachi o sobstvennykh znacheniyakh dlya lineinykh ellipticheskikh operatorov) (Vestnik Leningradskogo universiteta, No. 11, 1955) is concerned with the same problem.


208. The spectrum of a symmetric operator. We introduced earlier the concept of the spectrum of a self-conjugate operator and established a classification of its points. We shall do the same thing in the next sections for a closed symmetric operator, and investigate the variation of the spectrum with symmetric extensions of the operator.

Let $A$ be a closed symmetric operator. The number $\lambda$ is called a point of regular type of the operator $A$ if there exists a $k > 0$ such that

$$\|(A - \lambda E)x\| \ge k\|x\| \quad (x \in D(A)). \tag{161}$$

In view of (161) and the fact that $A$ is closed, $R(A - \lambda E)$ must be a subspace and $(A - \lambda E)^{-1}$ is a linear operator bounded on $R(A - \lambda E)$. Conversely, if $(A - \lambda E)^{-1}$ exists on $R(A - \lambda E)$ and is bounded, (161) follows from this, i.e. $\lambda$ is a point of regular type. It is easily shown, as in [129], that the set of points of regular type is open. The number $\lambda$ is called a regular point of $A$ if (161) is fulfilled and $R(A - \lambda E)$ is the whole of $H$. If $R(A - \lambda E) = H$, $\lambda$ is not an eigenvalue of $A$, since $R(A - \lambda E)$ must be orthogonal to the corresponding eigenelements, and $(A - \lambda E)^{-1}$ is an operator bounded in $H$ [186], i.e. (161) is satisfied.

If $\lambda$ is a real regular point of $A$, $(A - \lambda E)^{-1}$ is a bounded self-conjugate operator, so that $A$ is self-conjugate. Let us show that the set of regular points is open. It is enough to show that, if $\lambda_0$ is regular, the equation

$$(A - \lambda E)x = y \tag{162}$$

is uniquely soluble for any $y \in H$, provided $\lambda$ is sufficiently close to $\lambda_0$. Suppose that

$$|\lambda - \lambda_0| < \frac{1}{\|(A - \lambda_0 E)^{-1}\|}.$$

We rewrite (162) as

$$(A - \lambda_0 E)x + (\lambda_0 - \lambda)x = y,$$

which is equivalent to

$$x = (\lambda - \lambda_0)(A - \lambda_0 E)^{-1}x + (A - \lambda_0 E)^{-1}y,$$

whilst this latter is uniquely soluble for any $y \in H$ [88], since

$$\|(\lambda - \lambda_0)(A - \lambda_0 E)^{-1}\| < 1.$$

It may be shown as in [129] that, if $\lambda = \sigma + \tau i$ and $\tau \ne 0$,

$$\|(A - \lambda E)x\| \ge |\tau|\,\|x\| \quad (x \in D(A)), \tag{163}$$

i.e. all non-real $\lambda$ are points of regular type.
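
Both steps of this paragraph can be made explicit. The sketch below uses only the symmetry of $A$ and the bound on $|\lambda - \lambda_0|$ assumed above; the shorthand $R_0 = (A - \lambda_0 E)^{-1}$ is introduced here for brevity and is not the book's notation.

```latex
% (i) Inequality (163): for symmetric A, ((A - sigma E)x, x) is real, so the
%     cross term ((A - sigma E)x, i tau x) is purely imaginary and drops out.
\begin{align*}
\|(A - \lambda E)x\|^2
  &= \|(A - \sigma E)x - i\tau x\|^2
   = \|(A - \sigma E)x\|^2 + \tau^2 \|x\|^2 \;\ge\; \tau^2 \|x\|^2 . \\
\intertext{(ii) Solution of (162) near a regular point $\lambda_0$ by successive approximations:}
x &= \sum_{n=0}^{\infty} (\lambda - \lambda_0)^n R_0^{\,n+1}\, y ,
  \qquad R_0 = (A - \lambda_0 E)^{-1},
  \qquad |\lambda - \lambda_0|\,\|R_0\| < 1 .
\end{align*}
```

The series converges in norm because its terms are dominated by a geometric progression with ratio $|\lambda - \lambda_0|\,\|R_0\| < 1$, and substituting it into $x = (\lambda - \lambda_0)R_0 x + R_0 y$ confirms that it is the unique solution.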

Suppose that $\lambda_0 = \sigma + \tau i$, where $\tau \ne 0$, is a regular point. Now, by (163) and what has been said regarding the solubility of (162), all the $\lambda$ satisfying $|\lambda - \lambda_0| < |\tau|$ will also be regular points. On starting out from the regular value $\lambda_0$ and applying the argument just given a suitable number of times, every non-real $\lambda' = \sigma' + \tau' i$ in which the sign of $\tau'$ is the same as the sign of $\tau$ is seen to be a regular point. This proposition can be stated as follows:

Lemma. If one of the deficiency indices $p_\lambda$ or $q_\lambda$ of an operator $A$ vanishes for $\lambda = \lambda_0$ $(\operatorname{Im} \lambda_0 > 0)$, it vanishes for all $\lambda$ of the half-plane $\operatorname{Im} \lambda > 0$.

All non-real $\lambda$ are regular points in the case of a self-conjugate operator. Let us give an example of a closed symmetric operator $A$ which has no regular points. Let $H$ be $L_2(0, 1)$ and $A$ be the operator $i\,d/dx$, considered on the set of functions $\varphi(x)$ such that $\varphi(x)$ is absolutely continuous in the interval $[0, 1]$, $\varphi(0) = \varphi(1) = 0$ and $\varphi'(x)$ belongs to $L_2(0, 1)$. This is a closed symmetric operator [188]. Given any choice of $\lambda$, the function $e^{-i\bar\lambda x}$ belongs to $L_2(0, 1)$ and is orthogonal to all the functions $\psi(x)$ expressible in the form

$$\psi(x) = i\,\frac{d\varphi(x)}{dx} - \lambda\varphi(x),$$

where $\varphi(x) \in D(A)$, i.e. $e^{-i\bar\lambda x}$ is orthogonal to $R(A - \lambda E)$, whence it follows that $\lambda$ is not a regular point.
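
The orthogonality asserted in this example follows from one integration by parts. The check below is a sketch that assumes the function in question is $g(x) = e^{-i\bar\lambda x}$ (the name $g$ is introduced here), so that $\overline{g(x)} = e^{i\lambda x}$, and uses the boundary conditions $\varphi(0) = \varphi(1) = 0$.

```latex
% Scalar product in L_2(0,1): (psi, g) = \int_0^1 psi(x) \overline{g(x)} dx.
\begin{align*}
(\psi, g)
  &= \int_0^1 \bigl(i\varphi'(x) - \lambda\varphi(x)\bigr)\, e^{i\lambda x}\, dx \\
  &= \Bigl[\, i\varphi(x)\, e^{i\lambda x} \Bigr]_0^1
     - \int_0^1 i\varphi(x)\cdot i\lambda\, e^{i\lambda x}\, dx
     - \lambda \int_0^1 \varphi(x)\, e^{i\lambda x}\, dx \\
  &= 0 + \lambda \int_0^1 \varphi(x)\, e^{i\lambda x}\, dx
       - \lambda \int_0^1 \varphi(x)\, e^{i\lambda x}\, dx \;=\; 0 .
\end{align*}
```

Since $g \ne 0$, the range $R(A - \lambda E)$ is never the whole of $L_2(0, 1)$, whatever the value of $\lambda$.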

We define the spectrum of $A$ as the set of points of the $\lambda$ plane complementary to the set of regular points. This is the set of the points $\lambda$ at which $(A - \lambda E)$ has no bounded inverse defined throughout $H$. The kernel of the spectrum of $A$ is defined as the set of points complementary to the set of points of regular type. The spectrum and the kernel of the spectrum are closed sets, and the former (the spectrum) contains the latter (the kernel). The kernel of the spectrum must lie on the real axis. The spectrum may fill the entire plane, as is clear from the above example.

If $A$ is a self-conjugate operator, the kernel of the spectrum coincides with the spectrum [189].

It may easily be seen that the kernel of the spectrum of $A$ belongs to the kernel of the spectrum of any closed symmetric extension of $A$. This follows from the fact that, if $\lambda$ belongs to the kernel of the spectrum of $A$, this is equivalent to the existence of a sequence $x_n$ of normalized elements of $D(A)$ such that $(A - \lambda E)x_n \to 0$ as $n \to \infty$. This property is obviously preserved with the above-mentioned extensions of $A$.

We can now classify the points of the kernel of the spectrum of an operator $A$. As a preliminary, we take the case when $\lambda$ is an eigenvalue of $A$ ($\lambda$ is a real number). Let $P_\lambda$ be the subspace of corresponding eigenelements (including the zero element). We can write the lineal $D(A)$ as the orthogonal sum

$$D(A) = P_\lambda \oplus D_\lambda(A), \tag{164}$$

where $D_\lambda(A)$ is the lineal consisting of elements contained simultaneously in $H \ominus P_\lambda$ and $D(A)$. Let $A_\lambda$ denote the operator $A$ defined on $D_\lambda(A)$ and coinciding with $A$ on this lineal. If $\lambda$ is not an eigenvalue, $P_\lambda$ is absent and $D_\lambda(A)$ is the same as $D(A)$. We shall assume in this case that $A_\lambda$ is $A$. We can say that $(A_\lambda - \lambda E)$, regarded as an operator in $H \ominus P_\lambda$, has, for any $\lambda$, an inverse $(A_\lambda - \lambda E)^{-1}$, defined on $R(A - \lambda E)$. The $\lambda$ for which $(A_\lambda - \lambda E)^{-1}$ is an unbounded operator belong to the kernel of the spectrum. This part of the kernel is described as the continuous part. The eigenvalues also belong to the kernel, and this is described as the point part of the kernel. Every point of a spectral kernel belongs to one of these parts, though it may belong to both. We shall say that an eigenvalue belongs to the purely point part of the kernel if $(A_\lambda - \lambda E)^{-1}$ is bounded on $R(A - \lambda E)$. Every point of the kernel belongs either to the continuous or to the purely point part of the kernel, and indeed, only to one of these parts. When $A$ is given a closed symmetric extension, the continuous and point parts of the spectral kernel can only be widened. It may easily be seen that, when $A$ and $A_\lambda$ are closed, the continuous part of the spectrum of $A$ is characterized by the fact that $R(A - \lambda E)$ is a non-closed lineal.


209. Some theorems on extensions and their spectra. We shall 
start by proving the following theorem: 

THEOREM 1. If $\lambda$ is a real point of regular type of a closed symmetric operator $A$, a self-conjugate extension $\tilde A$ of $A$ exists for which $\lambda$ is a regular point.

It can be assumed without loss of generality that $\lambda = 0$. It follows from the conditions of the theorem that $R(A)$ is a subspace, and the bounded inverse $A^{-1}$ is defined on it [208]. We have to show that a self-conjugate extension $\tilde A$ of $A$ exists for which $R(\tilde A) = H$ [189]. Given the hypotheses, we have

$$H = R(A) \oplus U,$$

where $U$ is the subspace of all solutions of the equation $A^* u = 0$ [185]. We know that, in the present case, $R(A^*) = H$ [187], so that there exists for any $u \in U$ at least one solution of the equation

$$A^* v = u. \tag{165}$$

Let $V$ denote the lineal of all solutions of (165), when $u$ runs over the whole of $U$. Obviously, $U \subset V \subset D(A^*)$. Let $\tilde U$ denote the lineal of elements of $V$ orthogonal to $U$, and let us form the lineal $l$ of elements $x$ expressible as

$$x = y + \tilde u, \tag{166}$$

where $y \in D(A)$ and $\tilde u \in \tilde U$. Let us show that expression (166) for $x$ is unique. If this were not the case, a non-zero element $z$ would exist, belonging simultaneously to $D(A)$ and $\tilde U$. But then $Az \in R(A)$ and $Az = A^* z \in U$, since $z \in \tilde U \subset V$. But $R(A) \perp U$ and therefore $Az = 0$, i.e. $z \in U$, which in conjunction with $z \in \tilde U$ yields $z = 0$. This proves the uniqueness of expression (166), i.e. $l$ is the direct sum $D(A) + \tilde U$. We now define the operator $\tilde A$ on the lineal $l$ by putting $\tilde A x = Ay + A^* \tilde u$, and write $D(\tilde A)$ for $l$. Obviously, $A \subset \tilde A \subset A^*$. Let us show that $\tilde A$ satisfies all the requirements of the theorem. On the lineal $D(A)$, the operator $\tilde A$ coincides with $A$, and on $\tilde U$ the operator $A^*$ gives the whole of $U$, as follows from the definition of $\tilde U$ and the fact that $A^* u = 0$ for $u \in U$. Thus $R(\tilde A) = H$. It remains to prove the symmetry of $\tilde A$ on $D(\tilde A)$ [187]. Notice first of all that

$$(A^* \tilde u_1, \tilde u_2) = (\tilde u_1, A^* \tilde u_2) = 0 \quad (\tilde u_1 \text{ and } \tilde u_2 \in \tilde U).$$

Let $x_1$ and $x_2 \in D(\tilde A)$. Now, by (166), $x_1 = y_1 + \tilde u_1$, $x_2 = y_2 + \tilde u_2$, and we have

$$(\tilde A x_1, x_2) = (Ay_1 + A^* \tilde u_1, x_2) = (y_1, A^* x_2) + (A^* \tilde u_1, x_2) = (y_1, A^* x_2) + (A^* \tilde u_1, y_2)$$
$$= (y_1, A^* x_2) + (\tilde u_1, Ay_2) + (\tilde u_1, A^* \tilde u_2) = (y_1, \tilde A x_2) + (\tilde u_1, \tilde A x_2) = (x_1, \tilde A x_2).$$

The theorem is proved.

It can be shown that, if $A^{-1}$ is completely continuous, $\tilde A^{-1}$ is also completely continuous. For a detailed study of such extensions, both of abstract and differential operators, see M. I. Vishik (Trudy Moskovskogo matematicheskogo obshchestva, t. 1, 1952) and L. Hörmander (Acta Mathematica, 94, 3-4, 1955).

COROLLARY 1. If the real number $\lambda$ belongs to the purely point part of the spectral kernel, a self-conjugate extension $\tilde A$ of $A$ exists for which $\lambda$ also belongs to the purely point spectrum with the same subspace $P_\lambda$ of eigenelements as $A$.

It may easily be seen that $A_\lambda$, considered in the subspace $H_\lambda = H \ominus P_\lambda$, is a closed symmetric operator and satisfies the conditions of Theorem 1. It thus admits of a self-conjugate extension $\tilde A_\lambda$ in $H_\lambda$, such that $\lambda$ is a regular point. The operator $\tilde A$, with the domain of definition $D(\tilde A) = D(\tilde A_\lambda) \oplus P_\lambda$, coinciding with $\tilde A_\lambda$ on $D(\tilde A_\lambda)$ and with $A$ on $P_\lambda$, is obviously the extension indicated in the corollary.

COROLLARY 2. If $A$ is a maximal but non-self-conjugate operator, the continuous part of the spectral kernel of $A$ fills the entire real axis.

For otherwise, $A$ would have self-conjugate extensions.

COROLLARY 3. If there exists a real $\lambda$, not belonging to the continuous part of the spectral kernel of $A$, the deficiency indices $p_\lambda$ and $q_\lambda$ are the same for any $\lambda$ $(\operatorname{Im} \lambda > 0)$.

This follows from the fact that, given the hypothesis, $A$ has self-conjugate extensions.

THEOREM 2. Let the real $\lambda$ be a point of regular type of a closed symmetric operator $A$ and $U = H \ominus R(A - \lambda E)$. There now exists a self-conjugate extension $\tilde A$ of $A$ for which $\lambda$ belongs to the purely point spectrum and $U$ is the eigensubspace corresponding to $\lambda$.

We can assume without loss of generality that $\lambda = 0$. Notice that $R(A)$ and $U$ are subspaces, $U$ being the set of solutions of the equation $A^* z = 0$. Let $D(A) + U$ denote the set of elements of the form $x = y + z$, where $y \in D(A)$ and $z \in U$.

This form is unique. For otherwise, we should have a non-zero element $x_0$ such that $x_0 \in D(A)$ and $x_0 \in U$. It follows from this that $(x_0, Ax) = 0$ $(x \in D(A))$ and $(Ax_0, x) = 0$. But, since $D(A)$ is dense in $H$, we get $Ax_0 = 0$, and this contradicts the fact that $\lambda = 0$ is a point of regular type.

We can therefore define an operator $\tilde A$ on the direct sum $D(\tilde A) = D(A) + U$ by putting $\tilde A x = Ay$, if $x = y + z$, where $y \in D(A)$ and $z \in U$.

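The symmetry of $\tilde A$ follows immediately, since $(Ay, z) = 0$ for $y \in D(A)$ and $z \in U$, and hence, for $x_1 = y_1 + z_1$ and $x_2 = y_2 + z_2$,

$$(\tilde A x_1, x_2) = (Ay_1, y_2) = (y_1, Ay_2) = (x_1, \tilde A x_2).$$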


Let us show that $\tilde A$ is self-conjugate. Let

$$(\tilde A x, u) = (x, u^*) \quad (x \in D(\tilde A)). \tag{167}$$

Since $\tilde A^* \subset A^*$, we can say that $u \in D(A^*)$ and $u^* = A^* u$. On writing $x$ as $x = y + z$ as above, we obtain

$$(\tilde A x, u) = (Ay, u) = (y, u^*) + (z, u^*) = (y, A^* u) + (z, u^*),$$

or

$$(Ay, u) = (Ay, u) + (z, u^*),$$

$$(z, u^*) = (z, A^* u) = 0.$$

This equation holds for all $z \in U$, so that $u^* = A^* u \in R(A)$, and consequently there exists an element $y_0 \in D(A)$ such that $Ay_0 = u^*$, whence $A^* u - Ay_0 = A^*(u - y_0) = 0$, i.e. $u - y_0 = z_0 \in U$ and $u = y_0 + z_0 \in D(\tilde A)$. By (167), $\tilde A$ is in fact self-conjugate. The remaining assertions of the theorem regarding the properties of $\tilde A$ follow at once from its construction.

COROLLARY. If the real $\lambda$ belongs to the purely point spectrum of $A$, a self-conjugate extension $\tilde A$ can be formed, for which $\lambda$ also belongs only to the point part of the spectral kernel, where the subspace of eigenelements of $\tilde A$ corresponding to $\lambda$ is the same as the subspace of eigenelements of $A^*$ corresponding to $\lambda$.

The proof is similar to the proof of Corollary 1 of Theorem 1.


210. The independence of the deficiency indices on $\lambda$. We remarked above that the deficiency indices $p_\lambda$ and $q_\lambda$ do not depend on the choice of the complex number $\lambda$ from the upper half-plane. We shall prove this assertion in the present section. We note first of all that, if a linear operator $B$ (not necessarily bounded) maps one-to-one a subspace $V$ onto a subspace $W$, $V$ and $W$ have the same dimensions: $\dim V = \dim W$. This follows from the fact that the linearly independent elements $\{x_1, x_2, \ldots\}$ of the subspace $V$ are mapped by $B$ into linearly independent elements $\{Bx_1, Bx_2, \ldots\}$ of the subspace $W$, and vice versa. Let us introduce a further definition. We say that $m$ is the dimensionality of the lineal $l$ with respect to the modulus of the lineal $l'$ if $l$ contains $m$ and not more than $m$ linearly independent elements, no linear combination of which belongs to $l'$ (excluding the case of all the coefficients vanishing). We usually write $m = \dim l \pmod{l'}$. The number $m$ may in fact be infinite.

Let $A$ be a closed symmetric operator and $p_\lambda$, $q_\lambda$ its deficiency indices corresponding to some $\lambda$ from the upper half-plane. Now, as we have seen [202]:

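$$H = L_\lambda(A) \oplus M_\lambda(A), \qquad H = L_{\bar\lambda}(A) \oplus M_{\bar\lambda}(A), \tag{168}$$

where $M_\lambda(A)$ is the set of all the zeros of the operator $A^* - \lambda E$, $L_\lambda(A)$ is the set of all the elements of the form $y = (A - \bar\lambda E)x$ for $x \in D(A)$, and similarly for $M_{\bar\lambda}(A)$ and $L_{\bar\lambda}(A)$. Further [203]:

$$D(A^*) = D(A) + M_\lambda(A) + M_{\bar\lambda}(A), \tag{169}$$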
and any symmetric extension of $A$ may be formed with the aid of an isometric operator establishing a one-to-one correspondence between subspaces $N_\lambda(A)$ and $N_{\bar\lambda}(A)$ (of the same dimensionality) of the spaces $M_\lambda(A)$ and $M_{\bar\lambda}(A)$ [202].

It follows from (169) that

$$p_\lambda + q_\lambda = \dim D(A^*) \pmod{D(A)}, \tag{170}$$

so that the sum $p_\lambda + q_\lambda$ is independent of $\lambda$. Suppose first that it has a finite value. We form a maximal extension $A_0$ of $A$ and take a $\lambda = \lambda'$ of the upper half-plane. We can assume without loss of generality that $p_{\lambda'} \le q_{\lambda'}$. Hence it follows that the deficiency indices of $A_0$ with $\lambda = \lambda'$ are $(0, r_{\lambda'})$, where $r_{\lambda'} = q_{\lambda'} - p_{\lambda'}$. By the lemma of [208], the deficiency indices of $A_0$ are $(0, r_\lambda)$ for all $\lambda$ of the upper half-plane, where $r_\lambda = q_\lambda - p_\lambda$ [202]. It further follows from (169) that

$$D(A_0^*) = D(A_0) + M_{\bar\lambda}(A_0),$$

and we have

$$r_\lambda = \dim M_{\bar\lambda}(A_0) = \dim D(A_0^*) \pmod{D(A_0)},$$

whence it is clear that $r_\lambda = q_\lambda - p_\lambda$ is independent of $\lambda$. Thus $p_\lambda$ and $q_\lambda$ are independent of $\lambda$. Now suppose that $p_\lambda + q_\lambda$ is equal to infinity. If we have $p_{\lambda'} = \infty$ and $q_{\lambda'} = \infty$ for some $\lambda'$, self-conjugate extensions of $A$ are possible, and hence $p_\lambda = \infty$, $q_\lambda = \infty$ for any $\lambda$.

It remains to consider the case when one deficiency index is finite and the other infinite for some $\lambda = \lambda'$.

Let $p_{\lambda'}$ be finite and $q_{\lambda'} = \infty$. It follows at once from the above that $p_\lambda$ will be finite and $q_\lambda = \infty$ for every $\lambda$. We only need to show that $p_\lambda$ does not depend on $\lambda$. This is easily seen from the formula [202]:

$$D(A_0) = D(A) + (E - V_0)\,M_\lambda(A), \tag{171}$$

where $A_0$ is a fixed maximal extension, and $V_0$ is an isometric operator, dependent on the choice of $A_0$ and $\lambda$. In view of the existence of $(E - V_0)^{-1}$, it follows from (171) that

$$p_\lambda = \dim D(A_0) \pmod{D(A)}, \tag{172}$$

which shows in fact that $p_\lambda$ does not depend on $\lambda$.

It follows from Theorem 1 [209] that, if a real $\lambda = \lambda_0$ exists, which is a point of regular type of a closed symmetric operator $A$, $A$ admits of self-conjugate extensions, i.e. given the existence of a real point of regular type, the deficiency indices $(p, q)$ (independent of $\lambda$) are the same. They are equal to the dimensionality of the subspace $U$ of eigenelements of $A^*$ corresponding to the eigenvalue $\lambda = \lambda_0$.

For, on taking $\lambda_0 = 0$ and writing $\tilde A$ for the self-conjugate extension of $A$ for which $\lambda = 0$ is a regular point [209], we have in the present case: $H = R(A) \oplus U$; $D(\tilde A) = D(A) + \tilde A^{-1} U$, the sums representing the elements on the left-hand side being unique; consequently,

$$p = \dim D(\tilde A) \pmod{D(A)} = \dim \tilde A^{-1} U = \dim U.$$


211. The invariance of the continuous part of the spectral kernel in the case of symmetric extensions. We shall discuss in this section the closed symmetric extensions of a closed symmetric operator $A$, on the assumption that the deficiency indices $(p, q)$ of $A$ are finite.

We start by proving a simple lemma.

Lemma. If $U$ and $W$ are two subspaces, the second of which is finite-dimensional, then

$$V = U + W, \tag{173}$$

i.e. the set of elements $x = y + z$, where $y \in U$ and $z \in W$, is also a subspace.

We can obviously assume that $W$ has no elements in common with $U$ (except the zero element). Let $(w_1, w_2, \ldots, w_n)$ be the base of $W$. We write each $w_k$ as $w_k = w_k' + w_k''$, where $w_k' \in U$ and $w_k'' \perp U$. Let $W''$ denote the linear envelope of the $w_k''$ $(k = 1, 2, \ldots, n)$. The set $V$ can be written as the orthogonal sum of two subspaces:

$$V = U \oplus W'',$$

and the lemma is proved.
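
The regrouping behind the lemma can be written out explicitly. The sketch below assumes the decomposition $w_k = w_k' + w_k''$ with $w_k' \in U$ and $w_k'' \perp U$ described in the proof; the coefficients $c_k$ are introduced here for the illustration.

```latex
% Any element of U + W regroups into a part in U and a part in W''.
\[
x \;=\; y + \sum_{k=1}^{n} c_k w_k
  \;=\; \underbrace{\Bigl( y + \sum_{k=1}^{n} c_k w_k' \Bigr)}_{\in\, U}
  \;+\; \underbrace{\sum_{k=1}^{n} c_k w_k''}_{\in\, W'' ,\ \perp\, U},
  \qquad y \in U .
\]
```

Since $W''$ is finite-dimensional it is closed, and the orthogonal sum of two closed, mutually orthogonal linear sets is closed, so $V$ is a subspace. Finite dimensionality is what the argument relies on: the sum of two infinite-dimensional subspaces need not be closed.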

THEOREM. The continuous part of the spectral kernel of any closed symmetric extension $\tilde A$ of the operator $A$ is the same as for $A$.

We have seen that the continuous part of the spectral kernel cannot diminish on extension of $A$ [208]. Suppose that it increases, i.e. a real number $\lambda_0$ exists, which does not belong to the continuous part of the spectral kernel of $A$, but is contained in the continuous part of the kernel of $\tilde A$. Now, $R(A - \lambda_0 E)$ is a subspace, and $R(\tilde A - \lambda_0 E)$ a non-closed lineal.

On taking into account the formula for $D(\tilde A)$ [203] and the fact that the deficiency indices of $A$ are finite, we can write

$$R(\tilde A - \lambda_0 E) = R(A - \lambda_0 E) + W,$$

where $W$ is a finite-dimensional subspace. But this last formula, and the fact that $R(\tilde A - \lambda_0 E)$ is non-closed, contradict our lemma.
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212. The spectra of self-conjugate extensions. We have seen that, 
if A is a real point of regular type for A, there exist two types of self- 
conjugate extensions of A: for one, 4 is a regular point of A, and for 
the other it is an eigenvalue with multiplicity equal to the deficiency 
index of A (the indices are equal). 

Let us supplement these results. 

THEOREM 1. If the deficiency indices $(p, p)$ of a closed symmetric operator $A$ are finite, given any self-conjugate extension $\tilde A$ of $A$, the multiplicity of any eigenvalue can be raised by not more than $p$, and a real $\lambda$ which is not an eigenvalue of $A$ cannot be an eigenvalue of $\tilde A$ of multiplicity higher than $p$.

Suppose that $\lambda$ is not an eigenvalue of $A$, but is an eigenvalue of $\tilde A$ of multiplicity $k > p$. It follows from (172):

$$p = \dim D(\tilde A) \pmod{D(A)}$$

and $k > p$, that there is an eigenelement of $\tilde A$ belonging to $D(A)$, i.e. $\lambda$ is an eigenvalue of $A$, which contradicts our hypothesis. We have thus shown that $k \le p$. The case when $\lambda$ is an eigenvalue of $A$ is
similarly considered, using the operator A, [208]. 

THEOREM 2. If $A$ is a semi-bounded closed symmetric operator and its deficiency indices $(p, p)$ are finite, the spectrum of any self-conjugate extension of $A$ lying to the left of the lower bound of $A$ can only consist of a finite number of eigenvalues, the sum of the multiplicities of which does not exceed $p$.

We can assume without loss of generality that $A$ is a positive operator. Let $\mathcal{E}_\lambda$ be the spectral function of the self-conjugate extension $\tilde A$. Let us show that, given any $0 > \beta > \alpha$,

$$\dim \Delta\mathcal{E}_\lambda H = \dim(\mathcal{E}_\beta - \mathcal{E}_\alpha)H \le p. \tag{174}$$

Suppose the reverse inequality holds:

$$\dim \Delta\mathcal{E}_\lambda H > p. \tag{175}$$

We know that $\Delta\mathcal{E}_\lambda x \in D(\tilde A)$ for $x \in H$ and $p = \dim D(\tilde A) \pmod{D(A)}$. It follows from this that, by (175), there is a normalized element $x \in D(A)$ in the subspace $\Delta\mathcal{E}_\lambda H$. But now,

$$(Ax, x) = (\tilde Ax, x) = \int_{-\infty}^{+\infty} \lambda\, d(\mathcal{E}_\lambda x, x) = \int_{\alpha}^{\beta} \lambda\, d(\mathcal{E}_\lambda x, x) \le \beta < 0,$$

which contradicts the fact that $A$ is positive. Thus (174), and therefore the theorem, is proved.

213. Examples. 1. We considered in [188], in space $H = L_2(D)$, the operator $A$ which is defined on all smooth finite functions in $D$ and is the operator of differentiation of these functions:

$$D^k\varphi = (i)^k \frac{\partial^k \varphi(x)}{\partial x_1^{k_1} \cdots \partial x_n^{k_n}}. \tag{176}$$

We proved that $A$ is a symmetric operator, having a bounded inverse on $R(A)$. Hence it follows [209] that $A$ admits of a self-conjugate extension $\tilde A$ such that the equation

$$\tilde A\varphi = \psi \quad (\varphi \in D(\tilde A)) \tag{177}$$

is uniquely soluble for any $\psi(x) \in L_2(D)$. The domain of definition of $A$ is supplemented with this extension by the functions $v(x)$ of $L_2(D)$ for which $A^*A^*v = 0$. These functions have generalized derivatives $D^k$ and $D^kD^k$ and are found from the equation

$$D^kD^k v = 0.$$

From these, we choose for $D(\tilde A)$ those that are orthogonal to the solutions of the equation $D^k u = 0$. The operator $\tilde A$ has the form $D^k$ on $D(\tilde A)$, where $D^k$ is the generalized derivative (176).

2. Let us consider the operator

$$A = -\frac{d^2}{dx^2}$$

in space $H = L_2(0, +\infty)$, defined on all smooth functions, finite close to $x = 0$ and $x = +\infty$. $D(A^*)$ is the set of all functions $\varphi(x)$ with the following properties: $\varphi(x)$ and $\varphi'(x)$ are absolutely continuous on any finite interval $[0, a]$, $\varphi(x)$ and $\varphi''(x) \in L_2(0, +\infty)$. As pointed out, $\varphi'(x) \in L_2(0, +\infty)$ also. For $\varphi(x) \in D(A^*)$, we have $A^*\varphi(x) = -\varphi''(x)$ [188]. $D(A)$ consists of all the elements of $D(A^*)$ which satisfy the conditions $\varphi(0) = \varphi'(0) = 0$.

The following statements are easily verified: (a) $A$ is a positive operator; (b) the deficiency indices of $A$ are $(1, 1)$; (c) the continuous part of the spectral kernel coincides with the semi-axis $0 \le \lambda < +\infty$, and the same is true for any self-conjugate extension of $A$; (d) any symmetric extension of $A$ is a self-conjugate extension. It is defined on all the elements of $D(A^*)$ that satisfy one of the two conditions $\varphi'(0) - h\varphi(0) = 0$, where $h$ is a fixed real number, or $\varphi(0) = 0$. The latter condition corresponds to a Friedrichs extension, and the operator stays positive and has a purely continuous spectrum, consisting of the semi-axis $0 \le \lambda < +\infty$. If $h \ge 0$, the self-conjugate operator corresponding to the condition $\varphi'(0) - h\varphi(0) = 0$ has a purely continuous spectrum. When $h < 0$, it has one simple eigenvalue $\lambda = -h^2$, with eigenfunction $e^{hx}$.
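The eigenvalue in Example 2 can be checked numerically. The following is only a sketch under our own assumptions: the candidate eigenfunction $e^{hx}$, the finite-difference steps and the helper names are illustrative and are not taken from the text. The function $e^{hx}$ satisfies $\varphi'(0) - h\varphi(0) = 0$ for every real $h$, but it belongs to $L_2(0, +\infty)$ only when $h < 0$.

```python
import math

def phi(x, h):
    """Candidate eigenfunction exp(h*x) (our assumption, not a formula of the text)."""
    return math.exp(h * x)

def second_derivative(f, x, step=1e-4):
    """Central finite-difference approximation of f''(x)."""
    return (f(x + step) - 2.0 * f(x) + f(x - step)) / step ** 2

def check_eigenpair(h, x=0.7):
    """Return (boundary residual, eigen-equation residual) for phi = exp(h*x)."""
    f = lambda t: phi(t, h)
    d = 1e-6
    deriv_at_zero = (f(d) - f(-d)) / (2.0 * d)      # approximates phi'(0)
    boundary = deriv_at_zero - h * f(0.0)            # phi'(0) - h*phi(0)
    eigen = -second_derivative(f, x) + h * h * f(x)  # -phi'' - lambda*phi with lambda = -h^2
    return boundary, eigen

def l2_norm_squared(h):
    """Integral of exp(2*h*x) over (0, +inf); finite only for h < 0."""
    if h >= 0:
        raise ValueError("exp(h*x) is not square-integrable on (0, +inf) for h >= 0")
    return -1.0 / (2.0 * h)
```

Both residuals vanish up to discretization error for any $h$, so the sign of $h$ enters only through square-integrability.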


214. Infinite matrices. We discussed in [200] integral operators whose kernels satisfy conditions (107) and (108). We can consider similarly the operators in $l_2$:

$$y_i = \sum_{k=1}^{\infty} a_{ik}x_k \quad (i = 1, 2, \ldots), \tag{178}$$

represented by matrices such that $a_{ki} = \bar a_{ik}$ and

$$d_k^2 = \sum_{i=1}^{\infty} |a_{ik}|^2 = \sum_{i=1}^{\infty} |a_{ki}|^2 < \infty \quad (k = 1, 2, \ldots;\ d_k \ge 0). \tag{179}$$

Series (178) are here absolutely convergent for any element $x$ of $l_2$, but the series consisting of $|y_i|^2$ is not necessarily convergent, i.e. $(y_1, y_2, \ldots)$ may not be an element of $l_2$. Let $D(A_0)$ denote the lineal of $x \in l_2$ such that

$$\sum_{k=1}^{\infty} d_k |x_k| < \infty,$$

and $D(B)$ the lineal of $x$ such that $(y_1, y_2, \ldots) \in l_2$. It can be shown, as in [200], that $D(A_0)$ is everywhere dense in $l_2$ and that $D(A_0) \subset D(B)$. On further writing $A_0$ for the operator defined in $D(A_0)$ by (178), and $B$ for the operator defined in $D(B)$ by the same formulae, we can say that $A_0$ is a symmetric operator and that $B = A_0^*$ [cf. 200]. Notice that, by (179), all the base vectors belong to $D(B)$ and even to $D(A_0)$. The necessary and sufficient condition for $B$ to be self-conjugate is that, given any $x$ and $y \in D(B)$, we have [cf. 200]:

$$\sum_{i=1}^{\infty}\Bigl(\sum_{k=1}^{\infty} a_{ik}x_k\Bigr)\bar y_i = \sum_{k=1}^{\infty} x_k\Bigl(\sum_{i=1}^{\infty} a_{ik}\bar y_i\Bigr), \tag{180}$$

where, since $a_{ik} = \bar a_{ki}$:

$$\sum_{i=1}^{\infty} a_{ik}\bar y_i = \overline{\sum_{i=1}^{\infty} a_{ki}y_i}.$$
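As a simple illustration of the remark that $(y_1, y_2, \ldots)$ need not belong to $l_2$, we may take the diagonal matrix $a_{kk} = k$, for which (179) holds with $d_k = k$. This example, the truncation length and the helper names are our own assumptions, chosen only for the sketch below; the element $x_k = 1/k$ lies in $l_2$, whereas $y_k = 1$ for every $k$.

```python
def apply_diagonal(diag, x):
    """Apply formula (178) for a diagonal matrix: y_k = a_kk * x_k."""
    return [d * xk for d, xk in zip(diag, x)]

def squared_norm(v):
    """Sum of |v_k|^2 over the stored (truncated) components."""
    return sum(abs(t) ** 2 for t in v)

N = 1000                                        # truncation length (illustrative)
diag = [float(k) for k in range(1, N + 1)]      # a_kk = k, all other entries zero
x = [1.0 / k for k in range(1, N + 1)]          # x_k = 1/k, an element of l_2
y = apply_diagonal(diag, x)                      # y_k = 1 for every k

# The partial sums of |x_k|^2 stay below pi^2/6, those of |y_k|^2 grow like N.
norm_x, norm_y = squared_norm(x), squared_norm(y)
```

The partial sums of $|y_k|^2$ grow without bound as the truncation length increases, so this $x$ does not belong to $D(B)$.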


The symmetric operator $A_0$ may in fact be non-closed, and we can introduce the further new operator $A$, which is the closure of $A_0$, as we shall show. Let $D(A)$ be the lineal of $x \in D(B)$ such that

$$(Bx, y) = (x, By) \tag{181}$$

for any $y$ of $D(B)$, and $A$ is the operator given by (178) on $D(A)$. On the other hand, $A_0^{**}$ is defined on the lineal $D(A_0^{**})$ of elements $x$ such that $(By, x) = (y, x^*)$ for any $y$ of $D(B)$, and, since $A_0^{**} \subset A_0^*$, $x^*$ is given in terms of $x$ by (178), i.e. $x^* = Bx$. On comparing this with the definition of $A$, it will be seen that $A_0^{**}$ coincides with $A$. But $A_0^{**}$ is the closure of $A_0$, i.e. $A$ is the closure of $A_0$, and $A^* = A_0^* = B$. We must mention one property of the lineal $D(A)$ and of the operator $A$. We shall describe as a "finite element" any element $(x_1, x_2, \ldots)$ of $l_2$ which has a finite number of components $x_k$ different from zero. Let $D(A')$ be the lineal of "finite elements" and $A'$ the operator (178), defined on this lineal. Since $a_{ki} = \bar a_{ik}$, this operator is symmetric, and any element of $D(A')$ obviously belongs to $D(A_0)$, i.e. $A' \subset A_0$; consequently, on writing $\overline{A'}$ for the closure of $A'$, we have $\overline{A'} \subset A$, and hence $A^* \subset (A')^*$. Let $e_k$ be the $k$th base vector. Any element $y$ of $D((A')^*)$ must satisfy the equation $(Be_k, y) = (e_k, y^*)$, where $y^* = (A')^*y$. On writing $y_i$ and $y_i^*$ for the components of $y$ and $y^*$, this equation can be written as

$$\sum_{i=1}^{\infty} a_{ik}\bar y_i = \bar y_k^*, \quad\text{i.e.}\quad y_k^* = \sum_{i=1}^{\infty} a_{ki}y_i,$$

whence it is clear that $y \in D(A^*)$ and $y^* = A^*y$. On comparing this result with $A^* \subset (A')^*$, we see that $(A')^* = A^*$, so that $A^{**} = \overline{A'}$. But we have $A^{**} = A$, and hence $\overline{A'} = A$. This result can be stated as follows:

THEOREM. If $x \in D(A)$, there exists a sequence $\xi_n$ of "finite elements" such that $\xi_n \Rightarrow x$, $A\xi_n \Rightarrow Ax$ and $A = \overline{A'}$.

The fact that $A$ is self-conjugate means that $A^* = A$. If this is not the case, the deficiency indices of $A$ are defined by the dimensionality of the subspaces formed by the solutions of the equations $A^*x = ix$ and $A^*x = -ix$, and we can apply the above extension theory. Let us show further that, if the matrix $a_{ik}$ is real and the complex element $x' + x''i$ belongs to $D(A)$, then $x'$ and $x''$ also belong to $D(A)$, so that $x' - x''i \in D(A)$ also. For, by the theorem proved above, there exists a sequence $\xi_n = \xi_n' + \xi_n''i$ of "finite elements" such that $\|x - \xi_n\|^2 = \|x' - \xi_n'\|^2 + \|x'' - \xi_n''\|^2 \to 0$, and $\|Ax - A\xi_n\|^2 = \|Ax' - A\xi_n'\|^2 + \|Ax'' - A\xi_n''\|^2 \to 0$, whence it follows that $\|x' - \xi_n'\| \to 0$, $\|Ax' - A\xi_n'\| \to 0$, $\|x'' - \xi_n''\| \to 0$, $\|Ax'' - A\xi_n''\| \to 0$. In view of the fact that $A = \overline{A'}$, we obtain our assertion.


215. Jacobian matrices. Let us apply the above results to the Jacobian matrix:

$$\begin{pmatrix}
a_0 & b_0 & 0 & 0 & 0 & \cdots \\
b_0 & a_1 & b_1 & 0 & 0 & \cdots \\
0 & b_1 & a_2 & b_2 & 0 & \cdots \\
0 & 0 & b_2 & a_3 & b_3 & \cdots \\
\cdots & \cdots & \cdots & \cdots & \cdots & \cdots
\end{pmatrix}, \tag{182}$$

where the $a_i$ are real and $b_i > 0$. Condition (179) is obviously fulfilled. We start with $k = 0$ when enumerating the base vectors. We form the real polynomials $P_k(\lambda)$ in accordance with [cf. 167]:

$$\lambda P_k(\lambda) = b_kP_{k+1}(\lambda) + a_kP_k(\lambda) + b_{k-1}P_{k-1}(\lambda); \quad P_{-1}(\lambda) = 0; \quad P_0(\lambda) = 1, \tag{183}$$

from which it follows that

$$e_k = P_k(A)e_0, \tag{184}$$

where the base vectors have been denoted by $e_k$.
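Recurrence (183) is easy to test on a finite section of matrix (182). The sketch below is illustrative only: the function names and the sample values of $a_k$, $b_k$ and $\lambda$ are our own assumptions. It uses the fact that, applied to the vector $(P_0(\lambda), \ldots, P_{n-1}(\lambda))$, the $n \times n$ section reproduces multiplication by $\lambda$ in every row except the last, where the term $b_{n-1}P_n(\lambda)$ is missing.

```python
def jacobi_polys(a, b, lam, n):
    """Return [P_0(lam), ..., P_n(lam)] from (183) with P_{-1} = 0, P_0 = 1.

    Requires len(a) >= n and len(b) >= n.
    """
    polys = [1.0]
    for k in range(n):
        lower = b[k - 1] * polys[k - 1] if k > 0 else 0.0
        polys.append(((lam - a[k]) * polys[k] - lower) / b[k])
    return polys

def truncated_jacobi_apply(a, b, v):
    """Apply the len(v) x len(v) leading section of matrix (182) to the vector v."""
    n = len(v)
    out = []
    for k in range(n):
        s = a[k] * v[k]
        if k > 0:
            s += b[k - 1] * v[k - 1]
        if k < n - 1:
            s += b[k] * v[k + 1]
        out.append(s)
    return out

def recurrence_residuals(a, b, lam, n):
    """Residuals of J_n p = lam*p - b_{n-1} P_n e_{n-1}, where p = (P_0, ..., P_{n-1})."""
    polys = jacobi_polys(a, b, lam, n)
    v = polys[:n]
    out = truncated_jacobi_apply(a, b, v)
    res = [out[k] - lam * v[k] for k in range(n)]
    res[n - 1] += b[n - 1] * polys[n]
    return res

# Sample coefficients (hypothetical values, chosen only for the check).
sample_a = [0.5, -1.0, 2.0, 0.0, 1.0]
sample_b = [1.0, 2.0, 0.5, 3.0, 1.5]
residuals = recurrence_residuals(sample_a, sample_b, 0.3, 4)
```

All residuals should vanish up to rounding error.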
THEOREM 1. If the series

$$\sum_{k=0}^{\infty} |P_k(i)|^2 \tag{185}$$

is convergent, the operator $A$ is non-self-conjugate.

In view of the convergence of (185), we can form an element $x$ of $l_2$ with the components $x_k = P_k(i)$, so that $(x, e_k) = P_k(i)$ and $(e_k, x) = \overline{P_k(i)}$. On noting that $Ae_k = b_{k-1}e_{k-1} + a_ke_k + b_ke_{k+1}$, and using (183), we get

$$(Ae_k, x) = b_{k-1}\overline{P_{k-1}(i)} + a_k\overline{P_k(i)} + b_k\overline{P_{k+1}(i)} = \overline{iP_k(i)},$$

whence we can write $(Ae_k, x) = (e_k, ix)$, since $(e_k, x) = \overline{P_k(i)}$. Since $A$ and the scalar product are distributive, we have $(Ay, x) = (y, ix)$ for any "finite element" $y$ and, by the theorem of [214], this equation holds for any $y$ of $D(A)$, so that $x \in D(A^*)$ and $A^*x = ix$, whence it follows that $A$ is non-self-conjugate.

THEOREM 2. If series (185) is divergent, $A$ is self-conjugate.

It is sufficient to show that $A^*$ does not have the eigenvalues $\pm i$. Let us suppose the opposite. Let $A^*x = ix$, where the element $x(x_0, x_1, x_2, \ldots)$ is non-zero. By the definition of $A^*$ and the fact that $e_k \in D(A)$, we have $(Ae_k, x) = (e_k, ix)$ or $(x, Ae_k) = i(x, e_k) = ix_k$, i.e. $(x, b_{k-1}e_{k-1} + a_ke_k + b_ke_{k+1}) = ix_k$. On expanding the scalar products, we get $b_{k-1}x_{k-1} + a_kx_k + b_kx_{k+1} = ix_k$. On using (183) and the method of complete induction, we have $x_k = P_k(i)x_0$ and $x_0 \ne 0$. But this contradicts the divergence of (185). If we replace $i$ by $(-i)$ in (185), another divergent series is evidently obtained, since $P_k(-i) = \overline{P_k(i)}$, and, as above, $A^*$ does not have the eigenvalue $(-i)$; the theorem is proved. Thus the divergence of series (185) is necessary and sufficient for $A$ to be self-conjugate. On repeating word for word the proofs of the last two theorems, it can be shown that, if the series

$$\sum_{k=0}^{\infty} |P_k(a)|^2 \tag{186}$$

is convergent for any given non-real $a$, then $a$ and $\bar a$ are eigenvalues of $A^*$, and if series (186) is divergent for some non-real $a$, then $a$ and $\bar a$ are not eigenvalues of $A^*$, whence it follows that $A$ is self-conjugate, i.e. series (185) is also divergent. Conversely, if series (186) is convergent for some non-real $a$, $A^*$ has non-real eigenvalues and $A$ is not self-conjugate; series (185) is also convergent. These remarks lead to the theorem:

THEOREM 3. Only the following two cases are possible: series (186) is divergent for any non-real $a$ or is convergent for any non-real $a$. In the first case $A$ is a self-conjugate operator, whilst it is not self-conjugate in the second case.

Further, it follows at once from the proof of Theorem 2 that, if (185) is convergent, the components of the eigenelements of $A^*$ corresponding to the eigenvalue $i$ satisfy the equations $x_k = P_k(i)x_0$ $(k = 1, 2, \ldots)$, where $x_0$ is arbitrary and non-zero, i.e. the subspace $M_i(A)$ is one-dimensional. Similarly, $M_{-i}(A)$ is one-dimensional. It is evidently obtained from $M_i(A)$ by replacing the elements $x_k$ by the conjugates. Thus the deficiency indices of $A$ are $(1, 1)$ in the second case. The elements $x_i$ and $x_{-i}$ of subspaces $M_i(A)$ and $M_{-i}(A)$ can be determined up to an arbitrary complex factor from

$$x_i = \sum_{k=0}^{\infty} P_k(i)e_k; \quad x_{-i} = \sum_{k=0}^{\infty} P_k(-i)e_k,$$

and the elements of subspace $D(A_\theta)$ of the self-conjugate extension $A_\theta$ of $A$ are defined uniquely by $v = x_A + ax_\theta$, where $x_A \in D(A)$, $a$
is any complex number, zr, = i(e?? z_;+e* a2) and 0<@ < 2z. 
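Let $\mathcal{E}_\lambda$ be the spectral function of $A$ in the first case or of any given $A_\theta$ in the second case, and $\rho(\lambda) = (\mathcal{E}_\lambda e_0, e_0)$. We have, precisely as in [167]:

$$\int_{-\infty}^{+\infty} P_k(\lambda)P_l(\lambda)\,d\rho(\lambda) = \begin{cases} 0 & (k \ne l), \\ 1 & (k = l), \end{cases}$$

$$(Ae_k, e_l) = \int_{-\infty}^{+\infty} \lambda P_k(\lambda)P_l(\lambda)\,d\rho(\lambda), \quad e_k = \int_{-\infty}^{+\infty} P_k(\lambda)\,d\mathcal{E}_\lambda e_0,$$

where $A$ has to be replaced in the second case by $A_\theta$. The elements of matrix (182) are obviously given by

$$a_{ik} = \int_{-\infty}^{+\infty} \lambda P_i(\lambda)P_k(\lambda)\,d\rho(\lambda).$$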
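THEOREM 4. The polynomials $P_k(\lambda)$ form a closed system with respect to $\rho(\lambda)$.

Let $\varphi_\mu(\lambda)$ be functions equal to unity for $-\infty < \lambda \le \mu$ and zero for $\lambda > \mu$. Any function $\alpha(\lambda)$, taking a finite number of values $\alpha_1, \alpha_2, \ldots, \alpha_m$, each value $\alpha_k$ being taken on a finite interval, can evidently be written as a finite linear combination of functions $\varphi_\mu(\lambda)$ with different $\mu$. If, given any $\mu$, we can prove the closure equation for $\varphi_\mu(\lambda)$, it must hold, by virtue of the generalized closure equation, for any linear combination of $\varphi_\mu(\lambda)$, and hence for all functions $\alpha(\lambda)$ of the type indicated. But the lineal of such functions is everywhere dense in $L_2$ with respect to $\rho(\lambda)$ [60], so that the $P_k(\lambda)$ form a closed system [60]. It is thus sufficient to prove the closure equation for $\varphi_\mu(\lambda)$. Let us evaluate the integral of $\varphi_\mu^2(\lambda)$ and the Fourier coefficients of this function:

$$\int_{-\infty}^{+\infty} \varphi_\mu^2(\lambda)\,d\rho(\lambda) = \int_{(-\infty, \mu]} d\rho(\lambda) = \rho(\mu); \quad c_k = \int_{-\infty}^{+\infty} \varphi_\mu(\lambda)P_k(\lambda)\,d\rho(\lambda) = \int_{(-\infty, \mu]} P_k(\lambda)\,d\rho(\lambda).$$

We have to show that, for any $\mu$:

$$\rho(\mu) = \sum_{k=0}^{\infty} \int_{(-\infty, \mu]} P_k(\lambda)\,d\rho(\lambda) \cdot \int_{(-\infty, \mu]} P_k(\lambda)\,d\rho(\lambda).$$

In view of the closure equation, we have

$$\rho(\mu) = \|\mathcal{E}_\mu e_0\|^2 = \sum_{k=0}^{\infty} (\mathcal{E}_\mu e_0, e_k)(e_k, \mathcal{E}_\mu e_0),$$

and it is sufficient to show that

$$(\mathcal{E}_\mu e_0, e_k) = (e_k, \mathcal{E}_\mu e_0) = \int_{(-\infty, \mu]} P_k(\lambda)\,d\rho(\lambda),$$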
the right-hand side of which is real. But this last equation is a direct consequence of the earlier integral form of $e_k$ [cf. 192], and the theorem is proved.

We shall now give a simple sufficient criterion for matrix (182) to be self-conjugate. It follows at once from (183) that

$$b_k\frac{P_{k+1}(a)\overline{P_k(a)} - \overline{P_{k+1}(a)}P_k(a)}{a - \bar a} = |P_k(a)|^2 + b_{k-1}\frac{P_k(a)\overline{P_{k-1}(a)} - \overline{P_k(a)}P_{k-1}(a)}{a - \bar a},$$

and, on summing over $k$ from $k = 0$ to $k = n - 1$, we obtain

$$\sum_{k=0}^{n-1} |P_k(a)|^2 = b_{n-1}\frac{P_n(a)\overline{P_{n-1}(a)} - \overline{P_n(a)}P_{n-1}(a)}{a - \bar a},$$

and, in particular, with $a = i$:

$$\sum_{k=0}^{n-1} |P_k(i)|^2 = b_{n-1}\frac{P_n(i)\overline{P_{n-1}(i)} - \overline{P_n(i)}P_{n-1}(i)}{2i} = b_{n-1}\,\mathrm{Im}\bigl[P_n(i)\overline{P_{n-1}(i)}\bigr].$$

Since $P_0(\lambda) = 1$, the left-hand side is $\ge 1$, whence it follows that

$$\frac{1}{b_{n-1}} \le \mathrm{Im}\bigl[P_n(i)\overline{P_{n-1}(i)}\bigr] \le |P_n(i)|\,|P_{n-1}(i)| \le \frac{1}{2}\bigl(|P_n(i)|^2 + |P_{n-1}(i)|^2\bigr),$$

and we get, on summing from $n = 1$ to $n = m + 1$:

$$\sum_{n=0}^{m} \frac{1}{b_n} \le \sum_{n=0}^{m+1} |P_n(i)|^2,$$

whence we have, in view of Theorem 2:

THEOREM 5. If the series formed from the $1/b_n$ is divergent, $A$ is a self-conjugate operator.
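The summation identity at $a = i$ and the inequality leading to Theorem 5 can be checked numerically for any finite set of coefficients. The following sketch is illustrative: the function names are our own, and the sample coefficients $a_k = 0$, $b_k = \sqrt{(k+1)/2}$ are merely one convenient choice.

```python
def jacobi_polys(a, b, lam, n):
    """Return [P_0(lam), ..., P_n(lam)] from (183); needs len(a) >= n, len(b) >= n."""
    polys = [1.0]
    for k in range(n):
        lower = b[k - 1] * polys[k - 1] if k > 0 else 0.0
        polys.append(((lam - a[k]) * polys[k] - lower) / b[k])
    return polys

def check_identity(a, b, n):
    """Both sides of sum_{k<n} |P_k(i)|^2 = b_{n-1} * Im[P_n(i) * conj(P_{n-1}(i))]."""
    p = jacobi_polys(a, b, 1j, n)
    left = sum(abs(p[k]) ** 2 for k in range(n))
    right = b[n - 1] * (p[n] * p[n - 1].conjugate()).imag
    return left, right

def theorem5_sums(a, b, m):
    """Return (sum_{n<=m} 1/b_n, sum_{n<=m+1} |P_n(i)|^2); needs m + 1 coefficients."""
    p = jacobi_polys(a, b, 1j, m + 1)
    lower = sum(1.0 / b[n] for n in range(m + 1))
    upper = sum(abs(q) ** 2 for q in p)
    return lower, upper

# Sample coefficients (an assumption made only for this illustration).
size = 12
herm_a = [0.0] * size
herm_b = [((k + 1) / 2.0) ** 0.5 for k in range(size)]
lower_sum, upper_sum = theorem5_sums(herm_a, herm_b, 10)
```

Since the first returned sum never exceeds the second, divergence of the series of the $1/b_n$ forces divergence of (185).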

We shall mention without proof two facts directly connected with the above exposition. It can be shown that, if $A$ has deficiency indices $(1, 1)$, series (186) is convergent for any value of $a$. If $A$ is self-conjugate, (186) is divergent for all real $a$ except those that correspond to the point spectrum of $A$, if this latter exists. Moreover, with deficiency indices $(1, 1)$, every self-conjugate extension of $A$ has a purely point spectrum (see N. I. Akhiezer, Infinite Jacobian Matrices and the Problem of Moments (Beskonechnye matritsy Jakobi i problema momentov), Uspekhi matem. nauk, t. IX, 1941).

The Hermite polynomials may be taken as an example of a Jacobian matrix. We have defined these polynomials by the equation

$$H_k(\lambda) = (-1)^k e^{\lambda^2}\frac{d^k}{d\lambda^k}\bigl(e^{-\lambda^2}\bigr) \tag{187}$$

and have had the relationship

$$\lambda H_k(\lambda) = \frac{1}{2}H_{k+1}(\lambda) + kH_{k-1}(\lambda) \tag{188}$$

and the integral formula [III$_2$; 157]:

$$\int_{-\infty}^{+\infty} e^{-\lambda^2}H_k^2(\lambda)\,d\lambda = 2^k k!\sqrt{\pi}. \tag{189}$$

In order subsequently to obtain normalized polynomials, we introduce instead of (187) the polynomials

$$P_k(\lambda) = \frac{(-1)^k}{\sqrt{2^k k!}}\,e^{\lambda^2}\frac{d^k}{d\lambda^k}\bigl(e^{-\lambda^2}\bigr), \tag{190}$$

after which we can rewrite (188) as

$$\lambda P_k(\lambda) = \sqrt{\frac{k+1}{2}}\,P_{k+1}(\lambda) + \sqrt{\frac{k}{2}}\,P_{k-1}(\lambda),$$

where $P_0(\lambda) = 1$. Thus, if we take a Jacobian matrix by putting

$$a_k = 0; \quad b_k = \sqrt{\frac{k+1}{2}} \quad (k = 0, 1, 2, \ldots),$$

we in fact arrive at polynomials (190), where the relationships hold (cf. III$_2$; 220):

$$\frac{1}{\sqrt{\pi}}\int_{-\infty}^{+\infty} e^{-\lambda^2}P_k(\lambda)P_l(\lambda)\,d\lambda = \begin{cases} 0 & \text{for } k \ne l, \\ 1 & \text{for } k = l, \end{cases}$$

$$\frac{1}{\sqrt{\pi}}\int_{-\infty}^{+\infty} e^{-\lambda^2}\lambda P_k(\lambda)P_l(\lambda)\,d\lambda = \begin{cases} 0 & \text{for } |k - l| \ne 1, \\ \sqrt{\dfrac{k+1}{2}} & \text{for } l = k + 1. \end{cases}$$

It follows from Theorem 5 that $A$ is a self-conjugate operator in the present case, since the series formed from the $1/b_k = \sqrt{2/(k+1)}$ is divergent. It can be shown by using the integral formulae written above that, for the operator $A$:

$$\rho(\lambda) = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{\lambda} e^{-t^2}\,dt.$$

$A$ has a simple continuous spectrum, distributed over the entire interval $(-\infty, +\infty)$.
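The normalized recurrence for polynomials (190) can be compared with the first few Hermite polynomials written out explicitly. This is a sketch only: the explicit forms $H_0 = 1$, $H_1 = 2\lambda$, $H_2 = 4\lambda^2 - 2$, $H_3 = 8\lambda^3 - 12\lambda$ follow from (187), while the function names and the sample point are our own assumptions.

```python
import math

def normalized_hermite(lam, n):
    """P_0..P_n from lam*P_k = sqrt((k+1)/2)*P_{k+1} + sqrt(k/2)*P_{k-1}, P_0 = 1."""
    p = [1.0]
    for k in range(n):
        lower = math.sqrt(k / 2.0) * p[k - 1] if k > 0 else 0.0
        p.append((lam * p[k] - lower) / math.sqrt((k + 1) / 2.0))
    return p

def hermite_explicit(lam, k):
    """H_k(lam) for k = 0, 1, 2, 3, obtained by differentiating in (187)."""
    table = [1.0, 2.0 * lam, 4.0 * lam ** 2 - 2.0, 8.0 * lam ** 3 - 12.0 * lam]
    return table[k]

def normalization(k):
    """The factor sqrt(2^k * k!) appearing in (190)."""
    return math.sqrt(2.0 ** k * math.factorial(k))

sample_point = 0.7                      # arbitrary test value of lambda
from_recurrence = normalized_hermite(sample_point, 3)
from_formula = [hermite_explicit(sample_point, k) / normalization(k) for k in range(4)]
```

The two lists should agree to rounding error, which confirms the values $a_k = 0$, $b_k = \sqrt{(k+1)/2}$ for the first rows of the matrix.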

216. Matrices and operators. Let us investigate the connection between matrices and symmetric operators in Hilbert space $H$. Suppose first that a bounded self-conjugate operator $A$ is given in this space, and let $\varphi_1, \varphi_2, \ldots$ be any complete orthonormal system of elements of $H$. If the formula

$$a_{nk} = (A\varphi_k, \varphi_n) = (\varphi_k, A\varphi_n) \quad (a_{kn} = \bar a_{nk}) \tag{191}$$

defines the elements of some matrix $\{a_{nk}\}$, we have

$$x_n' = \sum_{k=1}^{\infty} a_{nk}x_k, \tag{192}$$

where the $x_k$ are the components of the element $x$, i.e. $x_k = (x, \varphi_k)$, and $x_n'$ are the components of the transformed element, i.e. $x_n' = (Ax, \varphi_n)$. Thus, given a definite choice of base vectors, the operator $A$ is given by the matrix of (192). If we choose another system of base vectors $\psi_1, \psi_2, \ldots$, and $U$ is the unitary transformation such that $U\varphi_k = \psi_k$ $(k = 1, 2, \ldots)$, where $u_{pq} = (U\varphi_q, \varphi_p) = (\psi_q, \varphi_p)$, the operator $A$ will correspond to the matrix with elements

$$b_{nk} = (A\psi_k, \psi_n) = \sum_{s=1}^{\infty} (A\psi_k, \varphi_s)(\varphi_s, \psi_n) = \sum_{s=1}^{\infty} (\psi_k, A\varphi_s)(\varphi_s, \psi_n) = \sum_{s=1}^{\infty} (\varphi_s, \psi_n)\sum_{t=1}^{\infty} (\psi_k, \varphi_t)(\varphi_t, A\varphi_s), \tag{193}$$

where we have made use of the generalized closure equation (18,) of [121]. 
On using the notation introduced above, we can write 


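$$b_{nk} = \sum_{s=1}^{\infty} \bar u_{sn}\sum_{t=1}^{\infty} a_{st}u_{tk}. \tag{194}$$

If we apply this formula to $b_{kn} = \bar b_{nk}$, then pass to the conjugates and replace the letter $s$ by $t$ and $t$ by $s$ in the right-hand side, we obtain

$$b_{nk} = \sum_{t=1}^{\infty} u_{tk}\sum_{s=1}^{\infty} \bar u_{sn}a_{st}. \tag{195}$$

Similarly,

$$a_{nk} = \sum_{s=1}^{\infty} u_{ns}\sum_{t=1}^{\infty} b_{st}\bar u_{kt} = \sum_{t=1}^{\infty} \bar u_{kt}\sum_{s=1}^{\infty} u_{ns}b_{st}. \tag{196}$$

Conversely, if $\{a_{nk}\}$ $(a_{kn} = \bar a_{nk})$ is a given matrix, satisfying the condition for boundedness [163], and the system of base vectors $\varphi_k$ $(k = 1, 2, \ldots)$ is fixed in $H$, formulae (192) define a bounded operator $A$, self-conjugate in $H$. The $k$th column of the matrix $\{a_{nk}\}$ gives the components of the transformed element obtained from the base vector $\varphi_k$, so that we can write:

$$A\varphi_k = \sum_{n=1}^{\infty} a_{nk}\varphi_n.$$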
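In a finite-dimensional space the two orders of summation in (194) and (195) trivially agree, which makes a convenient check on the index conventions. The sketch below uses a $2 \times 2$ Hermitian matrix and a $2 \times 2$ unitary matrix of our own choosing (hypothetical sample values, as are the function names); it also verifies that $\sum_n u_{pn}b_{nk} = \sum_t a_{pt}u_{tk}$, which in the finite case is simply $Ub = aU$.

```python
import cmath
import math

def transform_194(a, u):
    """b_nk = sum_s conj(u_sn) * (sum_t a_st * u_tk): order of summation of (194)."""
    n = len(a)
    return [[sum(u[s][row].conjugate() * sum(a[s][t] * u[t][col] for t in range(n))
                 for s in range(n))
             for col in range(n)]
            for row in range(n)]

def transform_195(a, u):
    """b_nk = sum_t u_tk * (sum_s conj(u_sn) * a_st): order of summation of (195)."""
    n = len(a)
    return [[sum(u[t][col] * sum(u[s][row].conjugate() * a[s][t] for s in range(n))
                 for t in range(n))
             for col in range(n)]
            for row in range(n)]

def matmul(x, y):
    """Product of two square matrices stored as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def max_abs_diff(x, y):
    """Largest modulus of an entry of x - y."""
    n = len(x)
    return max(abs(x[i][j] - y[i][j]) for i in range(n) for j in range(n))

# Sample data (assumptions): a unitary u with orthonormal columns, a Hermitian a.
theta, alpha = 0.4, 1.1
c, s = math.cos(theta), math.sin(theta)
phase = cmath.exp(1j * alpha)
u = [[c, -s * phase], [s, c * phase]]
a = [[2.0, 1 - 1j], [1 + 1j, -1.0]]
b = transform_194(a, u)
```

In the infinite case the agreement of the two repeated sums is no longer automatic.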


The connection between matrices and unbounded operators is more complicated. We shall in future describe as C-matrices those that satisfy the symmetry condition $(a_{kn} = \bar a_{nk})$ and condition (179). Let $F$ be a closed symmetric operator with the lineal $D(F)$, dense in $H$. We take a complete orthonormal system of elements $\varphi_k$ such that all the $\varphi_k$ belong to $D(F)$, and we define the elements of the matrix $\{a_{nk}\}$ by (191) with $A$ replaced by $F$. In view of the symmetry of $F$ and the closure equation, we can say that $\{a_{nk}\}$ is a C-matrix. On applying the generalized closure equation to the scalar product $(Fx, \varphi_n) = (x, F\varphi_n)$, on the assumption that $x \in D(F)$, we obtain

$$x_n' = (Fx, \varphi_n) = \sum_{k=1}^{\infty} (x, \varphi_k)(\varphi_k, F\varphi_n) = \sum_{k=1}^{\infty} a_{nk}x_k, \tag{197}$$

i.e. $F$ is given by (192) in the base vectors $\varphi_k$. The lineal $D(F)$ obviously contains all "finite elements", i.e. all finite linear combinations of base vectors, and, since the operator $A$ defined in [214] on the basis of the matrix $\{a_{nk}\}$ is the closure of the operator $A'$ defined by (192) on the lineal of "finite elements", we can say that $A \subset F$; consequently, $F^* \subset A^*$, i.e. $F^*$ is also given by (192) on the corresponding lineal $D(F^*)$. If $F$ is an extension of $A$, by Theorem 1 of [186], $D(F^*)$ is only part of the $D(B)$ that we defined in [214]. In the present case the same matrix yields different operators.

If, instead of $\varphi_k$, we take another system of base vectors $\psi_k$, which also belong to $D(F)$, the operator $F$ is given by the matrix $\{b_{nk}\}$, where, as above, (192) and (197) hold, with $a_{nk}$ replaced by $b_{nk}$. Now suppose that a C-matrix $\{a_{nk}\}$ is given, instead of the operator. We take an arbitrary system of base vectors $\varphi_k$ of $H$ and define the operator $A'$ by (192) or (197) on the lineal $D(A')$ of "finite elements". The closure of $A'$ leads to some closed symmetric operator $A$. We shall say that $A$ is generated by the matrix $\{a_{nk}\}$ and the system of base vectors $\varphi_k$, and we write $A \sim a_{nk}\{\varphi_k\}$. As a matter of fact, any symmetric closed operator in $H$ can be obtained in this way.

THEOREM. Any symmetric closed operator $A$ with $D(A)$ dense in $H$ can be generated by some C-matrix and system of base vectors $\varphi_k$.

It is sufficient to form the corresponding base vectors $\varphi_k$ of $D(A)$. The matrix is given by (191). These base vectors $\varphi_k$ must possess the following property: given any $x$ of $D(A)$, there exists a sequence $\omega_n$ of "finite elements" such that $\omega_n \Rightarrow x$ and $A\omega_n \Rightarrow Ax$. To obtain such $\varphi_k$, it is sufficient to form a sequence $w_n$ of $D(A)$ such that, given any $x$ of $D(A)$, there exists a subsequence $w_{n_1}, w_{n_2}, \ldots$ such that $w_{n_k} \Rightarrow x$ and $Aw_{n_k} \Rightarrow Ax$. Orthogonalization of the $w_n$ obviously leads to the $\varphi_k$, where it follows from $w_{n_k} \Rightarrow x$ and the fact that $D(A)$ is dense in $H$ that the sequence $w_n$ is dense in $H$, so that the system of base vectors $\varphi_k$ is complete. Let us turn to the formation of the $w_n$. We take some sequence $\chi_1, \chi_2, \ldots$ of elements dense in $H$. Let $p, q, r$ be any positive integer triple. If at any rate one element $x$ of $D(A)$ exists, such that $\|\chi_p - x\| < 1/r$ and $\|\chi_q - Ax\| < 1/r$, we associate one of these $x$ with the above triple, and write $x_{p,q,r}$. These elements can [1] obviously be enumerated; let us show that they have the properties required of $w_n$. Let $x \in D(A)$ and $\varepsilon$ be any given positive number. We choose $r$ satisfying $1/r < \varepsilon/2$, and elements $\chi_p$ and $\chi_q$ such that $\|\chi_p - x\| < 1/r$ and $\|\chi_q - Ax\| < 1/r$, which is possible, since the $\chi_k$ are dense in $H$. There now exists an element $x_{p,q,r}$ such that $\|\chi_p - x_{p,q,r}\| < 1/r$ and $\|\chi_q - Ax_{p,q,r}\| < 1/r$, whence, on observing that

$$\|x - x_{p,q,r}\| \le \|\chi_p - x\| + \|\chi_p - x_{p,q,r}\|; \quad \|Ax - Ax_{p,q,r}\| \le \|\chi_q - Ax\| + \|\chi_q - Ax_{p,q,r}\|$$

and that $1/r < \varepsilon/2$, we get: $\|x - x_{p,q,r}\| < \varepsilon$ and $\|Ax - Ax_{p,q,r}\| < \varepsilon$. Hence it follows, since $\varepsilon$ is arbitrary, that the $x_{p,q,r}$ possess the properties used above to characterize the sequence $w_n$; the theorem is proved.

The same closed symmetric operator can be generated by different matrices and base vectors. If $A \sim a_{nk}\{\varphi_k\}$ and $A \sim b_{nk}\{\psi_k\}$, and if we bring in the unitary operator $U$ described above, and put $u_{pq} = (\psi_q, \varphi_p)$, we get (194), (195) and (196). Notice also that, if $F$ is a given symmetric closed operator, the system of base vectors $\varphi_k$ belongs to $D(F)$, $\{a_{nk}\}$ is the matrix defined by (191) with $A$ replaced by $F$, and $A \sim a_{nk}\{\varphi_k\}$, then $F$ either coincides with $A$ or is an extension of $A$, as we have seen above.



217. The unitary equivalence of C-matrices. We had in the last section two systems $a_{nk}\{\varphi_k\}$ and $b_{nk}\{\psi_k\}$, generating the same operator $A$. The $u_{pq}$ were the elements of the matrix corresponding to the unitary transformation $U\varphi_k = \psi_k$, in the base vectors $\varphi_k$. The inner sum of (194) is the scalar product $(A\psi_k, \varphi_s)$, and we can say, in view of the closure equation, that the square of the modulus of this sum forms a convergent series on summing over $s$. The inner sum in (195) is obtained from the inner sum of (194) by passing to the conjugates, interchanging $s$ and $t$ and replacing $k$ by $n$. What has been said on summing over $t$ is also true in this case, i.e.

$$\sum_{s=1}^{\infty}\Bigl|\sum_{t=1}^{\infty} a_{st}u_{tk}\Bigr|^2 < \infty; \quad \sum_{t=1}^{\infty}\Bigl|\sum_{s=1}^{\infty} \bar u_{sn}a_{st}\Bigr|^2 < \infty. \tag{198}$$


This naturally leads to the following definition: 

DEFINITION 1. The unitary matrix $\{u_{pq}\}$ is said to be applicable to the C-matrix $\{a_{nk}\}$, if condition (198) is satisfied and if the repeated sums (194) and (195) lead to the same result. The resulting matrix $\{b_{nk}\}$ is called the transformed matrix.

It follows from the fact that $\{a_{nk}\}$ is a C-matrix and $\{u_{pq}\}$ a unitary matrix that, by Cauchy's inequality, the inner series of (194) and (195) are absolutely convergent, whilst (198) implies the convergence of the outer series also. In view of what has been said about the inner sums, one of conditions (198) implies the other condition. It follows at once from (194) and (195) that $b_{kn} = \bar b_{nk}$, and, in view of (198) and the fact that $\{u_{pq}\}$ is a unitary matrix, the sum over $k$ of the terms $|b_{nk}|^2$ is finite, i.e. $\{b_{nk}\}$ is a C-matrix. Let us prove the following theorem:

THEOREM 1. If the unitary matrix $U$ is applicable to $\{a_{nk}\}$, the inverse matrix $U^{-1}$ is applicable to $\{b_{nk}\}$, and the transformed matrix is $\{a_{nk}\}$.

We must show that

$$\sum_{n=1}^{\infty} u_{pn}b_{nk} = \sum_{t=1}^{\infty} a_{pt}u_{tk}; \quad \sum_{k=1}^{\infty} b_{nk}\bar u_{qk} = \sum_{s=1}^{\infty} \bar u_{sn}a_{sq}. \tag{199}$$

It is enough to prove the first. The proof of the second is similar. We have:

$$\sum_{n=1}^{\infty} u_{pn}b_{nk} = \sum_{n=1}^{\infty}\Bigl[\sum_{s=1}^{\infty}\Bigl(\sum_{t=1}^{\infty} a_{st}u_{tk}\Bigr)\bar u_{sn}\Bigr]u_{pn}.$$

By (198), the sum in the curved brackets can be regarded as composing the $\xi_s$ of some element $\xi \in l_2$. On writing $\gamma$ for the unitary operator in $l_2$ realized by the matrix $\{u_{pq}\}$, we can write the right-hand side of our last equation as

$$\sum_{n=1}^{\infty} (\gamma^{-1}\xi)_n u_{pn} = (\gamma\gamma^{-1}\xi)_p = \xi_p = \sum_{t=1}^{\infty} a_{pt}u_{tk}, \tag{200}$$

whence the first of (199) follows immediately. A direct consequence of (199) is that

$$\sum_{k=1}^{\infty}\Bigl|\sum_{n=1}^{\infty} u_{pn}b_{nk}\Bigr|^2 < \infty; \quad \sum_{n=1}^{\infty}\Bigl|\sum_{k=1}^{\infty} b_{nk}\bar u_{qk}\Bigr|^2 < \infty. \tag{201}$$

For, on writing $\eta_t = \bar a_{pt}$, we have an element $\eta$ of $l_2$ with components $\eta_t$, and the right-hand side of the first of (199) can be written as $\overline{(\gamma^{-1}\eta)_k}$, whence the first of (201) follows at once. The proof of the second is similar. On introducing the further element $\eta'$ of $l_2$ with components $\eta_n' = b_{nk}$, we can write the first of (199) as $(\gamma\eta')_p = \overline{(\gamma^{-1}\eta)_k}$. On multiplying by $\bar u_{qk}$ and summing over $k$, the right-hand side gives $\overline{(\gamma\gamma^{-1}\eta)_q} = \bar\eta_q = a_{pq}$, and we obtain

$$a_{pq} = \sum_{k=1}^{\infty}\Bigl(\sum_{n=1}^{\infty} u_{pn}b_{nk}\Bigr)\bar u_{qk},$$

i.e. one of formulae (196). The second formula is proved similarly; the theorem is proved.

We shall now give a definition of the unitary equivalence of two systems $a_{nk}\{\varphi_k\}$ and $b_{nk}\{\psi_k\}$, each of which contains a complete system of base vectors.

DEFINITION 2. Two systems $a_{nk}\{\varphi_k\}$ and $b_{nk}\{\psi_k\}$ are said to be unitary equivalents if the unitary matrix $U$ with elements $u_{pq} = (\psi_q, \varphi_p)$ is applicable to $\{a_{nk}\}$ and leads to the transformed matrix $\{b_{nk}\}$.

If the conditions of the definition are satisfied, it follows from the last theorem that the unitary inverse to the matrix $U$ with elements $\{u_{pq}\}$, i.e. the matrix with elements $u_{pq}^{-1} = \bar u_{qp} = (\varphi_q, \psi_p)$, is applicable to $\{b_{nk}\}$ and leads to the transformed matrix $\{a_{nk}\}$, i.e. the unitary equivalence of two systems is a reciprocal property. Notice also that $U\varphi_k = \psi_k$ and $U^{-1}\psi_k = \varphi_k$. The following is the fundamental theorem on equivalent systems.

THEOREM 2. The necessary and sufficient condition for two systems $a_{nk}\{\varphi_k\}$ and $b_{nk}\{\psi_k\}$ to be unitary equivalents is that the closed symmetric operators $A$ and $A_1$ generated by them have the same symmetric operator $F$ as their extensions.

Let us first prove the necessity. Let the systems be unitary equivalents. We have:

$$(A\varphi_p, \psi_q) = \sum_{s=1}^{\infty} (A\varphi_p, \varphi_s)(\varphi_s, \psi_q) = \sum_{s=1}^{\infty} a_{sp}\bar u_{sq},$$

$$(\varphi_p, A_1\psi_q) = \sum_{s=1}^{\infty} (\psi_s, A_1\psi_q)(\varphi_p, \psi_s) = \sum_{s=1}^{\infty} b_{qs}\bar u_{ps},$$

and the second of (199) shows that $(A\varphi_p, \psi_q) = (\varphi_p, A_1\psi_q)$. The same equation obviously holds for finite linear combinations of $\varphi_p$ and $\psi_q$. On using the definitions of $D(A)$ and $D(A_1)$ and passing to the limit in the scalar products, we obtain

$$(Ax, y) = (x, A_1y), \quad\text{if } x \in D(A) \text{ and } y \in D(A_1). \tag{202}$$

If $x$ belongs simultaneously to $D(A)$ and $D(A_1)$, in addition to (202) we have $(A_1x, y) = (x, A_1y)$, and, by (202), $(Ax - A_1x, y) = 0$ for any $y \in D(A_1)$, and, since $D(A_1)$ is dense in $H$, we have $Ax = A_1x$.

Let $D(F)$ be the lineal of elements $x$ expressible as $x = x' + y'$, where $x' \in D(A)$ and $y' \in D(A_1)$, and let us put $Fx = Ax' + A_1y'$. If we have the two forms: $x = x' + y' = x'' + y''$, where $x'$ and $x'' \in D(A)$, and $y'$ and $y'' \in D(A_1)$, it follows from $x' - x'' = y'' - y'$ that $x' - x''$ and $y'' - y'$ belong simultaneously to $D(A)$ and $D(A_1)$, and, by what has been said, $Ax' - Ax'' = A_1y'' - A_1y'$, i.e. $Ax' + A_1y' = Ax'' + A_1y''$, whence it follows that the definition of $Fx$ is unique. By (202) and the symmetry of $A$ and $A_1$, $F$ is a symmetric operator. It is evidently an extension of $A$ and $A_1$, and the necessity is proved.

Now suppose that the symmetric operator $F$ is an extension of $A$ and $A_1$. We have

$$\sum_{t=1}^{\infty} a_{pt}u_{tk} = \sum_{t=1}^{\infty} (A\varphi_t, \varphi_p)(\psi_k, \varphi_t) = \sum_{t=1}^{\infty} (\psi_k, \varphi_t)(\varphi_t, A\varphi_p) = (\psi_k, A\varphi_p) = (\psi_k, F\varphi_p) = (F\psi_k, \varphi_p) = (A_1\psi_k, \varphi_p) = \sum_{n=1}^{\infty} (A_1\psi_k, \psi_n)(\psi_n, \varphi_p) = \sum_{n=1}^{\infty} u_{pn}b_{nk},$$

i.e. we have obtained the first of formulae (199). The second can be obtained similarly. On repeating the proof of Theorem 1, it may be seen that $\{u_{pq}\}$ is applicable to $\{a_{nk}\}$, and (194), (195) and (196) hold; the unitary equivalence of the systems is thus proved.

It follows from the theorem that the unitary equivalence of two systems 
is completely determined by the closed symmetric operators generated by them, 
and the concept of unitary equivalence can be carried over in a natural way 
from systems to the closed symmetric operators generated by such systems. An 
immediate consequence of this is that the only unitary equivalent of a bounded operator is itself, and of a maximal operator part of itself. Unitary equivalence is a reciprocal property, but is not transitive, i.e. if an operator $C_1$ is the unitary equivalent of $C_2$ and $C_2$ the unitary equivalent of $C_3$, it does not follow that $C_1$ is the unitary equivalent of $C_3$. This will obviously be the case if $C_2$ is a maximal operator, since $C_1$ and $C_3$ here have the common extension $C_2$. The general theory
of C-matrices is substantially different from the theory of matrices correspond- 
ing to bounded self-conjugate operators. The theory of C-matrices is expounded 
in J. Neumann's Zur Theorie der unbeschränkten Matrizen (Crelle, Journal, Bd. 161, 1929) and in Wintner's book Spektraltheorie der unendlichen Matrizen (Leipzig, 1929).

218. The existence of the spectral function. Let us now turn to the proof of the fundamental theorem of [192], according to which, given any self-conjugate operator $A$, there exists a resolution of the identity $\mathcal{E}_\lambda$ such that $A$ is expressible by the Stieltjes integral (58) of [192]. We can make use here of the properties of $A$ that have been deduced without the aid of (58). We proved without the aid of this formula in [189] that, given any non-real $\lambda$, there exists a bounded operator $(A - \lambda E)^{-1}$, defined throughout $H$, such that the formula $x = (A - \lambda E)u$ maps $D(A)$ one-to-one onto $H$. Let us take the cases $\lambda = \pm i$ and put

$$x = (A - iE)u, \quad y = (A + iE)v \quad (u, v \in D(A));$$

and

$$u = (A - iE)^{-1}x, \quad v = (A + iE)^{-1}y \quad (x, y \in H).$$

We have, since $A$ is self-conjugate:

$$[(A - iE)^{-1}]^* = (A + iE)^{-1}.$$

On introducing the bounded self-conjugate operators

$$C = \frac{1}{2}\bigl[(A - iE)^{-1} + (A + iE)^{-1}\bigr], \quad B = \frac{1}{2i}\bigl[(A - iE)^{-1} - (A + iE)^{-1}\bigr], \tag{203}$$

we obtain

$$(A - iE)^{-1} = C + iB, \quad (A + iE)^{-1} = C - iB, \tag{204}$$

where the elements $Cx$ and $Bx$ belong to $D(A)$ for any choice of $x \in H$. It follows 
from (204) that 
$$(A - iE)(C + iB) = (AC + B) + i(AB - C) = E,$$
$$(A + iE)(C - iB) = (AC + B) - i(AB - C) = E,$$
whence 
$$AC = E - B; \tag{205}$$
$$AB = C. \tag{206}$$
If the element $x$ belongs to $D(A)$, we can also write 
$$(C + iB)(A - iE)x = x \quad \text{and} \quad (C - iB)(A + iE)x = x,$$
whence we obtain, on removing the brackets and comparing with the above: 
$$ACx = CAx, \qquad ABx = BAx \qquad (x \in D(A)), \tag{207}$$
i.e. the bounded operators $B$ and $C$ commute with $A$ in the sense of the definition 
of [191]. 
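
As an informal check on (203)-(207), and not as part of the proof, one may consider the simplest model, in which $H$ is one-dimensional and $A$ is multiplication by a real number $a$. This scalar model and the symbol $a$ are assumptions made purely for illustration. The operators then reduce to numbers:

```latex
\[
(A \mp iE)^{-1} = \frac{1}{a \mp i} = \frac{a \pm i}{a^{2}+1}, \qquad
C = \frac{a}{a^{2}+1}, \qquad
B = \frac{1}{a^{2}+1},
\]
\[
AC = \frac{a^{2}}{a^{2}+1} = 1 - B, \qquad
AB = \frac{a}{a^{2}+1} = C .
\]
```

In this model $0 < B \le 1$ and $|C| \le \tfrac{1}{2}$, and $B$ tends to zero as $|a| \to \infty$, which suggests why the values of $B$ near zero carry the information about the unbounded part of $A$.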
It follows from $BC = BAB = ABB = CB$ that 

$$CB = BC, \tag{208}$$


i.e. $B$ and $C$ commute with each other. On using (205) and (206), we get 
$B = BE = B(B + AC) = B^2 + BAC = B^2 + ABC = B^2 + C^2$, i.e. $B$ is a 
positive operator. Further, it follows from 
$$\|x\|^2 = \|(A - iE)u\|^2 = ((A - iE)u, (A - iE)u) = \|Au\|^2 + \|u\|^2,$$
$$\|y\|^2 = \|Av\|^2 + \|v\|^2$$
that $\|u\| \le \|x\|$ and $\|v\| \le \|y\|$, i.e. the norms of the operators 
$(A - iE)^{-1}$ and $(A + iE)^{-1}$ do not exceed unity, so that, by (203), the same can 
be said for $B$ and $C$. Let us also show that $Bx = 0$ implies $x = 0$. In fact, 
if $Bx = 0$, then $(Bx, x) = (B^2 x, x) + (C^2 x, x) = 0$, i.e. $(Bx, Bx) + (Cx, Cx) = 0$ 
or $\|Bx\|^2 + \|Cx\|^2 = 0$, whence it follows that $Cx = 0$. Now, (205) gives 
$x = Bx + ACx = 0$. In addition to the properties of $B$ and $C$ given above, 
we shall require a further lemma, the proof of which is given below. 
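
The algebraic relations obtained so far can be checked numerically in a finite-dimensional model. The following is a minimal sketch, assuming that $H$ is two-dimensional and that $A$ is an arbitrarily chosen Hermitian matrix; the matrix and all helper names are hypothetical and do not come from the text. In finite dimensions every operator is bounded, so the sketch only illustrates the identities and says nothing about domains.

```python
# Finite-dimensional sanity check of (203)-(208) and of B = B^2 + C^2.
# The 2x2 Hermitian matrix A below is an arbitrary illustrative choice.

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(X, Y, s=1):
    """Return X + s*Y for 2x2 matrices."""
    return [[X[i][j] + s * Y[i][j] for j in range(2)] for i in range(2)]

def scale(c, X):
    """Return c*X for a scalar c and a 2x2 matrix X."""
    return [[c * X[i][j] for j in range(2)] for i in range(2)]

def inv(X):
    """Inverse of a 2x2 matrix by the adjugate formula."""
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[X[1][1] / det, -X[0][1] / det],
            [-X[1][0] / det, X[0][0] / det]]

def close(X, Y, tol=1e-12):
    """Entrywise comparison up to a small tolerance."""
    return all(abs(X[i][j] - Y[i][j]) < tol for i in range(2) for j in range(2))

E = [[1, 0], [0, 1]]
A = [[2, 1 - 1j], [1 + 1j, -1]]      # Hermitian, hence self-conjugate

R_minus = inv(add(A, E, -1j))        # (A - iE)^{-1}
R_plus = inv(add(A, E, 1j))          # (A + iE)^{-1}

C = scale(0.5, add(R_minus, R_plus))           # (203)
B = scale(-0.5j, add(R_minus, R_plus, -1))     # (203), using 1/(2i) = -i/2

checks = {
    "(204)": close(add(C, B, 1j), R_minus) and close(add(C, B, -1j), R_plus),
    "(205) AC = E - B": close(matmul(A, C), add(E, B, -1)),
    "(206) AB = C": close(matmul(A, B), C),
    "(208) CB = BC": close(matmul(C, B), matmul(B, C)),
    "B = B^2 + C^2": close(B, add(matmul(B, B), matmul(C, C))),
}
all_ok = all(checks.values())
```

Each entry of `checks` corresponds to one relation of the text, so a failure would point to the specific identity that breaks for the chosen matrix.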

Lemma. Let $M_n$ ($n = 1, 2, \ldots$) be mutually orthogonal subspaces, the orthogonal 
sum of which gives the whole of $H$. Further, let a self-conjugate bounded operator 
$A_n$ be defined in each $M_n$. There now exists a unique self-conjugate operator $A$ 
in $H$, coinciding with $A_n$ in $M_n$. The lineal $D(A)$ consists of the elements $x$ for 
which the series formed from $\|A_n x_n\|^2$ is convergent, where $x_n$ has been written 
for the projection of $x$ onto $M_n$, and we have for these $x$: 

$$Ax = \sum_n A_n x_n. \tag{209}$$
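
The role of the convergence condition in the lemma may be seen from a simple example, given here only as an illustration. It is assumed that $H = \ell^2$ with orthonormal basis $e_n$, that $M_n$ is the one-dimensional subspace spanned by $e_n$, and that $A_n$ is multiplication by $n$; none of these choices comes from the text.

```latex
\[
H = \ell^{2}, \qquad M_n = \{\, c\, e_n \,\}, \qquad A_n e_n = n\, e_n ,
\]
\[
D(A) = \Bigl\{\, x = \sum_n \xi_n e_n \;:\; \sum_n n^{2} |\xi_n|^{2} < \infty \Bigr\},
\qquad
Ax = \sum_n n\, \xi_n e_n ,
\]
\[
\xi_n = \frac{1}{n}: \qquad \sum_n |\xi_n|^{2} < \infty
\quad\text{but}\quad
\sum_n n^{2} |\xi_n|^{2} = \sum_n 1 = \infty ,
\qquad\text{so } x \in H,\ x \notin D(A).
\]
```

Each $A_n$ is bounded on its own subspace, yet the operator assembled from them is unbounded and is defined only on a dense lineal, not on the whole of $H$.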


Let us return to the operators $B$ and $C$. The spectrum of $B$ lies in the segment 
$[0, 1]$, and, since $Bx = 0$ implies $x = 0$, the point $\lambda = 0$ does not belong to 
the point spectrum, so that, if we write $\mathcal{E}'_\lambda$ for the spectral function of $B$, we 
have $\mathcal{E}'_0 = 0$. On writing $M_n$ for the subspace onto which the operator 
$\mathcal{E}'_{1/n} - \mathcal{E}'_{1/(n+1)}$ projects, we can say that the $M_n$ are mutually orthogonal, and 
that their orthogonal sum is equal to $H$. If the bounded self-conjugate operator 
$F$ commutes with $\mathcal{E}'_\lambda$, by the theorem of [148], $M_n$ reduces $F$. This will hold 
for $C$ and any real continuous function of $B$ [193]. Let $\varphi_n(\lambda) = 1/\lambda$ for 
$1/(n+1) \le \lambda \le 1/n$, and be equal to a constant outside this interval, whilst the 
continuity is preserved at the ends; let us introduce the self-conjugate bounded 
operator $\varphi_n(B)$. If $z \in M_n$, then $\mathcal{E}'_\lambda z = z$ for $\lambda \ge 1/n$ and $\mathcal{E}'_\lambda z = 0$ for 
$\lambda \le 1/(n+1)$. This follows at once from 

$$\mathcal{E}'_\lambda z = \mathcal{E}'_\lambda \left(\mathcal{E}'_{1/n} - \mathcal{E}'_{1/(n+1)}\right) z$$


and the fact that $\mathcal{E}'_\mu \mathcal{E}'_\lambda = \mathcal{E}'_\mu$ for $\mu < \lambda$. If we express $\varphi_n(B)z$ and $Bz$ by 
Stieltjes integrals in terms of $\mathcal{E}'_\lambda$ and make use of the definition of $\varphi_n(\lambda)$ and what 
has just been said regarding $\mathcal{E}'_\lambda z$, we find that $\varphi_n(B) B = B \varphi_n(B) = E$ in 
$M_n$, i.e. $B$ and $\varphi_n(B)$ are inverses in $M_n$. If $z \in M_n$, we can write $z = B\varphi_n(B) z$, 
whence it is clear, since $Bx$ belongs to $D(A)$ for any $x \in H$, that $z \in M_n$ implies 
$z \in D(A)$. On using (206), we can write: 
$Az = AB\varphi_n(B) z = C\varphi_n(B) z$, whence it is clear that $A$ is a bounded operator 
in $M_n$. On also observing that $C$ and $\varphi_n(B)$ commute and that $M_n$ reduces $C$ 
and $\varphi_n(B)$, we can assert that $A$ is bounded and self-conjugate in $M_n$. Let 
$\mathcal{E}^{(n)}_\lambda$ denote the resolution of the identity corresponding to this operator in $M_n$. 
We can form, on the basis of the lemma, the self-conjugate operator $\mathcal{E}_\lambda$ in $H$: 


$$\mathcal{E}_\lambda x = \sum_{n=1}^{\infty} \mathcal{E}^{(n)}_\lambda x_n, \tag{210}$$


and it may easily be verified that $\mathcal{E}_\lambda$ is a resolution of the identity in $H$. If we 
form the Stieltjes integral (58) of [192] with this $\mathcal{E}_\lambda$, which defines a self-conjugate 
operator, it will be seen, on noting that $\mathcal{E}_\lambda x = \mathcal{E}^{(n)}_\lambda x$ if $x \in M_n$, that it yields 
$A$ if $x \in M_n$, and hence, by the lemma, it defines the operator $A$. It remains to prove the 
lemma stated above. 
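
The decomposition into the subspaces $M_n$ used above can be illustrated in a finite-dimensional model. The following is a minimal sketch, assuming that $A$ is diagonal with a hypothetical list of real eigenvalues `eigs`, so that $B$ and $C$ are diagonal with entries $1/(1+a^2)$ and $a/(1+a^2)$; the function names `band` and `phi` are introduced here for illustration only. Each eigenvector is assigned to the $M_n$ with $1/(n+1) < b \le 1/n$, and $A$ is recovered on $M_n$ as $C\varphi_n(B)$.

```python
import math

# Hypothetical eigenvalues of a diagonal self-conjugate operator A.
eigs = [-3.0, -0.5, 0.0, 0.5, 2.0, 7.0]

def band(a):
    """Index n of the subspace M_n containing the eigenvector with eigenvalue a.

    The eigenvalue b = 1/(1 + a^2) of B satisfies 1/(n+1) < b <= 1/n
    exactly when n = floor(1 + a^2).
    """
    return math.floor(1.0 + a * a)

def phi(n, lam):
    """phi_n: equal to 1/lam on [1/(n+1), 1/n], constant and continuous outside."""
    lo, hi = 1.0 / (n + 1), 1.0 / n
    return 1.0 / min(max(lam, lo), hi)

# Group the eigenvalues of A according to the subspace M_n they fall into.
subspaces = {}
for a in eigs:
    subspaces.setdefault(band(a), []).append(a)

# On M_n the operator A acts as C * phi_n(B); recover each eigenvalue this way.
recovered = []
for n, members in sorted(subspaces.items()):
    for a in members:
        b = 1.0 / (1.0 + a * a)      # eigenvalue of B
        c = a / (1.0 + a * a)        # eigenvalue of C
        recovered.append((n, a, c * phi(n, b)))
```

In this model every eigenvalue $a$ lying in $M_n$ satisfies $a^2 < n$, which is the finite-dimensional counterpart of the statement that $A$ is bounded on each $M_n$ even though it need not be bounded on $H$.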
Let $x$ belong to the lineal $D(A)$ of the elements for which 

$$\sum_n \|A_n x_n\|^2 < \infty;$$

we define an operator $A$ by the formula $Ax = A_1 x_1 + A_2 x_2 + \ldots$ Let us 
show that $A$ is self-conjugate. The lineal $D(A)$ clearly contains finite sums of 
elements $x_n$, so that $D(A)$ is dense in $H$. The symmetry of $A$ follows from the 
fact that $A_n$ is self-conjugate and 


$$(Ax, y) = \Bigl(\sum_n A_n x_n, \sum_n y_n\Bigr) = \sum_n (A_n x_n, y_n) = \sum_n (x_n, A_n y_n) = \Bigl(\sum_n x_n, \sum_n A_n y_n\Bigr) = (x, Ay),$$


where $x$ and $y \in D(A)$ and use has been made of the continuity and distributiveness 
of the scalar product, as also of the orthogonality of the subspaces $M_n$. 
Thus $A^* \supset A$, and we have to show that, if $x \in D(A^*)$, then $x \in D(A)$. On 
observing that $A^2 x_n$ belongs to $M_n$ and that $x - x_n \perp M_n$, we can write 

$$(A^*(x - x_n), Ax_n) = (x - x_n, A^2 x_n) = 0,$$
whence, by Pythagoras’ theorem, when $n = 1$, 
$$\|A^* x\|^2 = \|A^*(x - x_1)\|^2 + \|Ax_1\|^2.$$
We can similarly write 
$$\|A^*(x - x_1)\|^2 = \|A^*(x - x_1 - x_2)\|^2 + \|Ax_2\|^2,$$
i.e. 
$$\|A^* x\|^2 = \|A^*(x - x_1 - x_2)\|^2 + \|Ax_1\|^2 + \|Ax_2\|^2,$$
and in general 

$$\|A^* x\|^2 = \|A^*(x - x_1 - x_2 - \ldots - x_n)\|^2 + \sum_{k=1}^{n} \|Ax_k\|^2,$$
whence it follows that 

$$\sum_n \|A_n x_n\|^2 \le \|A^* x\|^2,$$


so that $x \in D(A)$; we have proved that $A$ is self-conjugate. It remains to show 
that there exists a unique self-conjugate operator coinciding with $A_n$ on $M_n$. 
Suppose that, in addition to the $A$ obtained, there exists a further operator 
$A'$. Since $A'$ is self-conjugate, it must be closed. We have for finite sums: 

$$A' \sum_{k=1}^{m} x_k = A \sum_{k=1}^{m} x_k = \sum_{k=1}^{m} A_k x_k,$$

since $A$ and $A'$ coincide on $M_n$. In view of the fact that $A'$ is closed, we can say 
that $A'$ is defined on $D(A)$ and coincides on this lineal with $A$, i.e. $A' \supset A$. 
On the other hand, on replacing $A^*$ by $A'$ in the proof given above, we can conclude 
that, if $x \in D(A')$, then $x \in D(A)$, so that $A'$ coincides with $A$; the 
lemma is proved. 